A STUDY OF ONE COMPANY’S CRITERIA FOR
SELECTING COLLEGE GRADUATES¹


A. P. JOHNSON 
Purdue University 


The selection of college graduates who will remain and
create jobs for themselves within their company is a
present-day practical problem for hundreds of business
and industrial concerns.
One company in the field of national merchandising con- 
siders each of its better applicants on an almost clinical basis, 
as worthy of thorough all around study. 


CRITERIA OF SELECTION 


As a minimum the following data were obtained on eighty 
selected applicants during a recent four-year period: 


1. Score on the Otis Intelligence Test.²
2. Score on the O’Connor English Vocabulary Test.³
3. Rating on ‘‘Family Background’’ as judged from application
blank (father’s occupation, number of executives
and professional people among relatives, cultural attainments
of home, etc.).
4. Rating on Hard Work or Industriousness as judged from
application blank (high school and college grades, per
cent of expenses earned, activities and sports participated
in, etc.).
5. Rating on Extroversion-Introversion as judged from
school, outside of school, and work activities, from interview,
and in some cases from ratings on Bernreuter Personality
Inventory and Strong Vocational Interest Blank.
6. Rating on Writing Flair, based upon school positions,
work and other positions held in which writing was
important, upon judgments of samples of writing submitted,
etc.

1 The author is indebted to Dr. Joseph Tiffin, Associate Professor of
Research in Industrial Psychology at Purdue University, for numerous and
extremely helpful suggestions in the compilation of the data for, and the
preparation of, this paper. The author is indebted also to Dr. Tiffin, Mr.
R. J. Greenly and the four members of the Industrial Psychology class:
Miss Breemes, Messrs. Smith, Foltz, and Gewirtz, who gave freely of their
time and effort in making the ratings herein reported.
Needless to say the courtesy of the company in making these data available
is deeply appreciated.
2 Otis S-A, Higher, 20-minute limit.
3 O’Connor’s Worksample 95, Form AB.

The Otis test and the O’Connor vocabulary were used as a 
preliminary screening device, such that only rarely was an 
applicant favorably considered when his Otis Test score was 
less than 50 or his O’Connor Test score below 110. 

These values, essentially those tentatively recommended by
the company’s personnel staff trained in psychological tests
and measurements, were justified by the process of trial and
error. Hardly ever did a graduate who received a score below
either of these levels prove thoroughly acceptable to his super- 
visors. The other qualifications of applicants during the 
four-year period who scored barely above these minima were 
usually scrutinized especially carefully. 
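
The preliminary screen described above can be sketched in a few lines. This is only an illustration, not the company's procedure: the function and constant names are hypothetical, and the rule is treated as a hard cutoff although the text says exceptions were made "only rarely." The two minima (50 and 110) come from the text, as do the scores of Applicant #47 reported later in the paper.

```python
# Cutoffs reported in the text; the names are illustrative assumptions.
OTIS_MINIMUM = 50
OCONNOR_MINIMUM = 110


def passes_screen(otis_score: int, oconnor_score: int) -> bool:
    """Return True when neither score falls below its cutoff."""
    return otis_score >= OTIS_MINIMUM and oconnor_score >= OCONNOR_MINIMUM


# Applicant #47 in the paper: Otis raw score 70, O'Connor raw score 129.
applicant_47_passes = passes_screen(70, 129)
print(applicant_47_passes)
```

An applicant who clears both minima only narrowly would, per the text, have his other qualifications examined with particular care; the sketch does not model that step.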

The Otis and O’Connor tests are well standardized instruments
whose reliability and probable validity have been rather
well established.⁴

The management had long considered the intangible qualities 
of good Family Background and good habits of work as im- 
portant qualifications of college graduate applicants to their 
company. The applicants who were hired were expected to 
go through a period of intensive training in personal selling 
and were to be trained later for selling through advertising 
and sales promotion media. It was considered desirable, therefore,
for the applicant to be at least of a near extroverted type
and to have at least a moderate Flair for Writing.

4 The Otis S-A Test of Higher Mental Ability, due to its proved value
and its ease and speed of administration and scoring, is widely used today
in industry. High scores on the O’Connor Vocabulary have been shown
to be typical of high executives. See the author’s ‘‘A Study of the
English Vocabulary Scores of 75 Executives,’’ Technical Report Number
Two of the Human Engineering Laboratory, Hoboken, N. J., 1935.

The study herein reported seeks to objectify, and to estab- 
lish estimates of the reliability and validity of the ratings used. 


METHOD OF GATHERING DATA ON APPLICANT 


The writer had been interested in this problem for some time 
during which he had personal access to the company’s file of 
applications from college graduates. Data were collected for
applicants (some hired, some rejected) during the years 1935,
1936, 1937 and 1938 on whom at least an application blank,
interview record, and Otis Intelligence and O’Connor Vocabulary
test scores were available. When the files had been reviewed
for all of the applicants whose surnames began with A, B,
C or D on whom such data were found, the time left for
gathering the data dictated that practically all of the remaining
information concern only the applicants hired.

The collected data were thrown into an envelope and laid 
aside for about a month. When the material was sorted out 
it was found to include forty-one who had been hired and 
thirty-nine who had been rejected. It is believed that by 
virtue of the method of selection used fairly representative 
groups of rejected and hired applicants were obtained. 

The following data gathered on one of the applicants indi- 
cate the form in which the bits of information were gathered 
from the application blank and other sources: 

Applicant #47 
Otis Raw Score: 70 
O’Connor Raw Score: 129 
Family Background: Father, college graduate, a journalist with a mod- 
erate sized N. J. evening newspaper. 
Evidences of Hard Work: B-plus grades in a large Eastern school, earned 
about three-fourths of school costs including scholarships. 
Evidences of Extroversion-Introversion: In high school an entrant in N. Y. 
Times oratorical contest; college dormitory chairman; Boys’ 
Camp counselor. 

Writing Flair: Associate Editor of college daily, Associated Press corre- 
spondent, wants to do advertising copywriting. 
METHOD OF OBJECTIFYING RATINGS


An Industrial Psychology class of six members became interested
in the problem of objectifying the four ratings for
each applicant on ‘‘Family Background,’’ ‘‘Hard Work,’’
‘‘Extroversion-Introversion,’’ and ‘‘Writing Flair.’’

Data like those above for applicant #47 were typed out on
a 3 x 5 card, one for each applicant. All of the data except
the Otis and O’Connor test scores (so as to eliminate any
subjective biases due to these scores) were projected on a small
screen. The six members of the class rated each applicant
simultaneously on the four above mentioned characteristics
using four categories for each: H (or 4) as highest; A (or 3)
as above average; B (or 2) as below average; and C (or 1)
as lowest. Each class member was asked to use his own
absolute scale based upon his own impression of the meaning
of each characteristic. The individual ratings of any applicant
by the different members of the class were not compared
until all eighty applicants had been rated on all four traits.
These ratings, ranging from a maximum value of four to a
minimum of one, were totaled for the six raters to give a
maximum total rating of twenty-four and a minimum total
rating of six on any one characteristic. Combining the six
ratings increased the reliability of judgment as to these characteristics
from the range of about 0.40 to 0.58 to a range of
0.82 to 0.87 (see the section, Reliability of Ratings, following).
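
The totaling step can be illustrated with a short sketch. The category values (H = 4, A = 3, B = 2, C = 1) and the requirement of six raters are taken from the text; the function name and the sample ratings are hypothetical.

```python
# Letter categories and their numeric values, as given in the text.
CATEGORY_VALUES = {"H": 4, "A": 3, "B": 2, "C": 1}


def total_rating(letters: list[str]) -> int:
    """Sum the six raters' category values for one applicant on one trait."""
    if len(letters) != 6:
        raise ValueError("expected one rating from each of the six raters")
    return sum(CATEGORY_VALUES[letter] for letter in letters)


# Hypothetical ratings, for illustration only: 4 + 4 + 3 + 3 + 2 + 3.
example_total = total_rating(["H", "H", "A", "A", "B", "A"])
print(example_total)  # 19
```

Six ratings of H give the maximum of 24 and six ratings of C the minimum of 6, matching the limits stated in the text.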

Typical information resulting in highest, intermediate, and
lowest total ratings is recorded in footnote 5 below.
This information defines

5 A rating of 24 on Family Background was given on the basis of the
following data:
(a) Fine cultured family; father detailing for Chemical Co.;
grandfather, a prominent clergyman.
(b) Father President of a small railroad; a governor of N. C. in the
family.
A rating of 16 on Family Background was given on the basis of:
(a) Father former real estate and insurance salesman in ———————,
N. D.
(b) Father, deceased, a mechanical and electrical engineer; Mother
tutoring and translating to buy them food.
A rating of 10, the lowest given on Family Background, was on the 
basis of: 
(a) Lives on a rural free delivery route; family have only the barest 
means; stepfather a machinist; no professional people in family. 
(b) Father, a farmer; brother, a salesman; sister, a beauty operator. 
A rating of 24 on Hard Work was defined by the raters as follows: 
(a) Earned 75 per cent of school costs; B-plus grades; references say 
sincere, hard worker. 
(b) Earned 100 per cent of costs; B-minus grades. 
A rating of 16 on Hard Work was defined by the six raters in these ways: 
(a) B-plus grades; earned 30 per cent of costs. 
(b) Earned 20-25 per cent of costs; B-minus grades. 
A rating of 8 on Hard Work, the next to lowest rating, is defined by
these data:
(a) One reference concludes that he expects to succeed without hard
work.
(b) C grades; earned one-fourth of costs; has held summer laboring
and minor jobs apparently obtained through his father.

A rating of 24 on Extroversion-Introversion is represented by: 

(a) President of social fraternity, Hi-Y club, college student body, 
and student executive board; Bernreuter Personality Inventory 
rates him more extroverted than 95 per cent adult men and more 
socially aggressive than 99 per cent adult men. 

(b) Campus leader; ‘‘Man of the Year’’ at his university; Bernreuter 
personality rates him 96 per cent and 99 per cent, respectively, 
on extroversion and social aggressiveness compared with adult 
men; Strong’s Interest Blank rates him ‘‘A’’ as life insurance 
salesman, personnel manager, ‘‘ Y’’ physical director, office clerk, 
and vacuum cleaner salesman. 

A rating of 16 on Extroversion-Introversion is represented by:

(a) Varsity baseball and basketball player; does not enjoy dancing,
bridge, cocktails; no social fraternity; no selling experience;
Bernreuter Inventory rates him more extroverted, more socially
aggressive than 76 per cent adult men; Strong’s blank rates
him ‘‘A’’ as life insurance, real estate, and vacuum cleaner
salesman.

(b) In high school plays; has given two talks over the radio in Pitts- 
burgh; says shortcoming is shyness; assistant to the publicity 
director of a Pennsylvania college. 
A rating of 10 on Extroversion-Introversion is based on:

(a) Two months selling with father’s concern; two references say,
‘‘personality has to grow on one, is not winning’’; another
says, ‘‘like his father, a successful plugger’’; glee club business
manager.

(b) Bernreuter Inventory rates him more extroverted than 62 per cent
adult men, more socially aggressive than 74 per cent; he rates
B-plus as Journalist on Strong’s blank; has done nothing to
distinguish himself; apparently never more than a private in cadet
military unit.

A rating of 24 on ‘‘Writing Flair’’ was defined by the raters as follows:

(a) Editor of high school current events paper; head of (large Eastern
university) publicity department.

(b) Biographical sketches for Time Magazine; on staff of three (another
large Eastern university) publications; chairman of the
daily paper.

A rating of 15 on ‘‘Writing Flair’’ was defined by the six raters as
follows:

(a) Has written radio continuity; has helped organize and run college
magazine; wants journalistic work.

(b) Editor of college yearbook and paper; English major.

A rating of 6 on Writing Flair was given when there was no evidence
whatsoever of any such talent.


more accurately than other words could the characteristics 
inferred by the six raters. In short these statements are 
fairly precise and fairly objective definitions of varying de- 
grees of the traits rated. 


RELIABILITY OF RATINGS 


The maximum rating on any trait is twenty-four which 
represents the highest possible rating (four) given by every 
one of the six raters. The minimum rating is six, the sum 
of six lowest possible ratings of one each. 

The highest correlation (0.58 by the contingency method⁶)
between the ratings of any two members of the class on
‘‘Family Background’’ was that between ratings by the two
oldest members; the lowest (0.41) occurred between ratings by
one of the older and one of the younger members of the class.


6 The contingency method was used inasmuch as these ratings were 
essentially qualitative (not quantitative) data. 
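
The paper does not print the formula it used for the contingency method. The sketch below assumes the standard Pearson coefficient of contingency, C = sqrt(chi-square / (N + chi-square)), computed from a cross-tabulation of two raters' categories. The function name and the sample tables are hypothetical.

```python
import math


def contingency_coefficient(table: list[list[float]]) -> float:
    """Pearson's C = sqrt(chi2 / (N + chi2)) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n + chi2))


# Hypothetical 2 x 2 tables: perfect agreement and no association.
perfect = contingency_coefficient([[10, 0], [0, 10]])
unrelated = contingency_coefficient([[5, 5], [5, 5]])
print(round(perfect, 3), round(unrelated, 3))
```

C cannot reach 1.00 even for perfect association; its ceiling rises with the number of categories, which is why the fineness of the classification matters when C is read as if it were r.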
However, when the total of ratings by three judges on
‘‘Family Background’’ was compared with the total ratings
on that trait by the remaining three judges, the correlation
between them by the contingency method⁷ was 0.76. According
to Garrett, for a 5 x 5 fold classification (as used here)
‘‘C,’’ the contingency coefficient of correlation, may be taken
as roughly equivalent to ‘‘r,’’ the product moment coefficient
of correlation. The reliability of the total ratings by the six
judges combined, as estimated by application of the Spearman-Brown
prophecy formula to this correlation of 0.76, is
0.86 (on ‘‘Family Background’’), a significantly high figure
for ratings on a somewhat intangible characteristic.
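
The step from 0.76 to 0.86 follows from the Spearman-Brown prophecy formula, r_kk = k r / (1 + (k - 1) r), with k = 2 because the two half-panels of three judges are combined into one panel of six. A minimal sketch, in which the function name is an illustrative choice:

```python
def spearman_brown(r: float, k: float = 2.0) -> float:
    """Predicted reliability when a measure is lengthened k-fold."""
    return k * r / (1 + (k - 1) * r)


# Half-versus-half correlation for "Family Background" reported in the text.
full_reliability = spearman_brown(0.76, k=2)
print(round(full_reliability, 2))  # 0.86
```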

The same method was followed in estimating the reliabilities 
of the ratings on the other three traits. 


VALIDITY OF RATINGS 


Three general methods of exploring various aspects of the 
validity of the ratings were used, viz.: (a) Intercorrelations 
of test scores and ratings, (b) Comparison of the group hired 
and the group rejected, and, (c) Correlation between test 
scores and ratings and service or merit ratings on the job. 


(a) Intercorrelations of Test Scores and Ratings 


The intercorrelations between test scores and ratings on
the four traits are given in Table 1. Correlations along the
diagonal are reliabilities.

The reliabilities of the Otis and O’Connor tests as well as 
the correlation between them are product moment correla- 
tions; the remaining coefficients have been obtained by use of 
the contingency method with the eighty cases of this study. 

The highest intercorrelation is that between O’Connor Vo- 
cabulary scores and ratings for ‘‘Hard Work’’ in the college 
years. The tendency to put forth extra effort in the college 
situation may have some relationship to having or acquiring 
vocabulary knowledge in those years. 

7 See H. E. Garrett, Statistics in Psychology and Education. New
York: Longmans, Green and Co., 1926, pp. 195-203.
TABLE 1

                   Otis     O’Connor   Family     Hard     Extrov.-   Writing
                   Score               Backgrd.   Work     Introv.    Flair

Otis Score         0.92*
O’Connor           †        0.95‡
Family Backgrd.                        0.86
Hard Work                              0.38       0.87
Extrov.-Introv.    0.38     -0.34      0.36       0.46     0.82
Writing Flair      0.41     0.49       -0.42      0.48     0.46       0.87

[Blank cells, and the Otis-O’Connor correlation marked †, are illegible in the source. The Otis and O’Connor reliabilities are those listed in Table 3.]

* See Arthur S. Otis, Manual of Directions and Key (revised) for Otis
Self-Administering Tests of Mental Ability, Intermediate and Higher
Examinations. Yonkers, N. Y.: World Book Co., 1928.

† P.E. equals 0.04. N equals 80.

‡ See Johnson O’Connor, Psychometrics. Cambridge, Mass.: Harvard
University Press, 1934, p. 124. Data for 400 nearly unselected persons.







The next highest correlation, that between O’Connor Vo- 
cabulary scores and ratings on ‘‘ Writing Flair,’’ might reason- 
ably have been anticipated. 

Some of the remaining correlations furnish a basis for in- 
teresting speculation. 


(b) Comparison of the Group Hired and the 
Group Rejected 


The group hired by the company average measurably higher 
than the group rejected, on the following four characteristics 
as shown by Table 2: 


1. Otis S-A Higher (20 min.) form A test scores.
2. O’Connor Vocabulary (wks. 95, form AB) test scores.
3. Ratings of ‘‘Extroversion-Introversion.’’
4. Ratings of ‘‘Writing Flair.’’


As Table 2 shows, the differences between the medians of the
two groups satisfy the conventional statistical standard of a
completely reliable obtained difference (the ratio, difference
over P.E. of difference, is at least 4)⁸ in the cases of items 1, 2,
and 4 above. The chances are 99 in 100 that the true difference
between medians on item 3, ratings of ‘‘Extroversion-Introversion,’’
is greater than zero.⁹


TABLE 2
Comparison of the Group Hired and the Group Rejected

                                          Group     Group      Difference over
Comparison criterion                      hired     rejected   P.E. of difference

Otis S-A Higher (20 Min.)
  form A scores
    Median
    Range
O’Connor Vocabulary, wks.
  95, form AB scores
    Median
    Range
‘‘Family Background’’ ratings
    Median                                20
    Range                                 10-24
‘‘Hard Work’’ ratings
    Median
    Range
‘‘Extroversion-Introversion’’ ratings
    Median
    Range
‘‘Writing Flair’’ ratings
    Median                                20        12
    Range                                 6-24      6-23
Number of Cases                           41        39

[Blank cells are illegible in the source; the ranges shown are those quoted from this table in the text.]


Table 2 shows no significant differences in ratings on
‘‘Family Background’’ between those hired and those rejected.


8 H. E. Garrett, op. cit., p. 136.
9 H. E. Garrett, op. cit., p. 135. It is realized that at best this statistical
measure of significant differences is indicative only, inasmuch as it is
based on the assumption of a normal distribution of scores or ratings while
these distributions are not normal but in general rather flat and moderately
skewed.
It shows a difference of doubtful significance in ratings on
‘‘Hard Work’’ between those hired and those rejected.


(c) Correlations between Test Scores and Ratings 
and Service or Merit Ratings on the Job 


The correlations (rank difference method) between service 
or merit ratings on the job (final rankings by company offi- 
cials) and each of the six criteria were found to be as given in 
Table 3 below. 

TABLE 3

                                               Reliability    Correlation of criterion
Criterion                                      of criterion   with ratings on the job*

Otis S-A Higher (20 Min.) Form A scores        0.92            0.27 ± 0.14
O’Connor Vocabulary, wks. 95, Form AB scores   0.95            0.17 ± 0.14
‘‘Family Background’’ ratings                  0.86            0.40 ± 0.12
‘‘Hard Work’’ ratings                          0.87           -0.15 ± 0.14
‘‘Extrov.-Introv.’’ ratings                    0.82           -0.22 ± 0.12
‘‘Writing Flair’’ ratings                      0.87            0.09 ± 0.14





* The reliability of the service or merit ratings on the job is not known 
but it is presumed to be high as they represent the collective judgment of 
five experienced company executives as to the general quality of each 
man’s work over a period varying from one to three years depending on 
the date the man was hired. Unfortunately comparable data on only 
twenty-three of the forty-one persons hired were available. 


That the correlation between success on the job and the
ratings on ‘‘Family Background’’ falls between 3 and 4 times
its Probable Error is in all probability an indication of a significant
relationship between these two.¹⁰
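
The comparison of each coefficient with its probable error can be checked directly from the Table 3 figures. The sketch below only divides each printed correlation by its printed P.E.; it does not attempt to reproduce how the probable errors were computed, and the dictionary keys and function name are illustrative.

```python
# (correlation, probable error) pairs as printed in Table 3.
TABLE_3 = {
    "Otis": (0.27, 0.14),
    "O'Connor": (0.17, 0.14),
    "Family Background": (0.40, 0.12),
    "Hard Work": (-0.15, 0.14),
    "Extrov.-Introv.": (-0.22, 0.12),
    "Writing Flair": (0.09, 0.14),
}


def ratio_to_probable_error(r: float, pe: float) -> float:
    """Absolute size of a coefficient in units of its probable error."""
    return abs(r) / pe


ratios = {name: ratio_to_probable_error(r, pe) for name, (r, pe) in TABLE_3.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Only the ‘‘Family Background’’ ratio lies between 3 and 4; every other coefficient is less than twice its probable error.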

Of the remaining correlation coefficients, that between success
on the job and Otis S-A Higher (20 min.) form A test
scores is the highest, although all are so close to the value of
their Probable Errors as to indicate little predictive significance.

10 To judge from the definitions of high and low ratings on ‘‘Family
Background’’ as given in the earlier footnotes, this factor represents a
composite rating on socially accepted business or professional success,
economic success, and apparent cultural background of the applicant’s
family.

Reference to Table 2 shows that among those hired the range
of ratings on ‘‘Family Background’’ is 10 to 24 and the median
is 20. Thus there is some tendency for the ratings to ‘‘pile up’’
at one end of the scale. The range of ratings on ‘‘Writing
Flair’’ is 6 to 24 and the median 20 for those hired while for
those rejected the range is 6 to 23 and the median 12. On this
characteristic there is a definite ‘‘piling up’’ at the low rating
end for those rejected. This extreme ‘‘piling up’’ on the high
end of the scale for ratings on traits other than ‘‘Family Background’’
may be a partial explanation for the low and not
statistically significant correlations against the service ratings.
Similarly the spread of the ratings on ‘‘Family Background’’ 
may in part account for the higher and statistically significant 
correlation obtained between these ratings and the service 
ratings. 

Should ratings of ‘‘Family Background’’ be more empha- 
sized in the selection process the range of ratings on that char- 
acteristic may be expected to be depopulated at the lower end 
and perhaps narrowed somewhat as well for those hired. If 
this should happen, a future correlation between such ratings
and success on the job would in all probability be lower than
the present figure as a result of the narrowing of the distribution
of ratings.¹¹ The more rigorous selection on the basis of
ratings on ‘‘Family Background’’ should in all probability
insure a higher average level of service or merit ratings among 
those hired. 


CONCLUSIONS 


1. It is possible to objectify and to define fairly well the
intangible characteristics of ‘‘Family Background,’’ ‘‘Hard
Work,’’ ‘‘Extroversion-Introversion,’’ and ‘‘Writing Flair.’’


11 E. F. Lindquist, A First Course in Statistics. Boston: Houghton
Mifflin Company, 1938, pp. 179-182.







2. It is possible from a relatively small amount of objective 
information for 80 cases to make ratings of acceptable statis- 
tical reliability (0.82 to 0.87) on these characteristics if ratings 
by six or more reasonably well educated persons on each case 
can be combined. 

3. Under the selection procedure at the time of this study 
ratings on ‘‘ Writing Flair’’ most markedly differentiated the 
group hired from those rejected. 

4. In this company the combined ratings on ‘‘ Family Back- 
ground’’ are significantly related to service or merit ratings. 

5. The group hired did not differ measurably in ratings on 
‘‘Family Background’’ from those rejected.

6. The company could improve its selection procedure by 
making a more rigorous selection on the basis of ratings on 
‘‘Family Background.’’





THE VOCATIONAL INTERESTS OF 
PROFESSIONAL WOMEN 


PART II


MARGARET SEDER 


The Educational Records Bureau 


In the preceding part of this paper, evidence was presented
to substantiate the hypothesis that the interests of men and
women engaged in a given occupation are the same. The
following analysis represents an attempt to determine how 
closely the interests of the criterion groups of men and women 
in occupations of the same name, used by E. K. Strong, Jr., in 
developing the scoring keys for the Vocational Interest Blank 
and the Vocational Interest Blank for Women, resembled each 
other. This was done by analyzing weights given each item 
by each scoring key. 

It was thought desirable, however, first to ascertain whether 
the response made on one blank differed from the response to 
the same item on the other blank for any large number of 
people. For this analysis, there were available a blank for 
men and a blank for women filled out by each of 60 women
physicians and 69 life insurance saleswomen.²

A tabulation was made of the response made to each item 
on each blank and then responses to the 268 common or very 

1 Assistance in preparation of the materials reported in this part of the
study was furnished by the personnel of the Works Progress Administration
Official Project No. 465-71-3-36.

2 The other women making up the group of 100 physicians and 100 sales- 
women who served as subjects in Part I of this investigation, filled out the 
women’s blank and a mimeographed blank containing the items occurring 
only on the men’s blank. This procedure did not result in any closer 
agreement between ratings on the men’s and women’s blanks than when 
both complete blanks were used. 
TABLE 1

Frequency Distribution of the Number of Times Each of 268 Items Which
Occur in Both the Vocational Interest Blanks Was Answered
Differently on the 2 Blanks by 60 Women Physicians
and 69 Life Insurance Saleswomen

Number of Times     Frequency       Frequency (Life
Item Was Altered    (Physicians)    Insurance Saleswomen)

[Table body illegible in the source. The one run of frequencies that
survives, with its column and rows unidentified, reads 14, 21, 28,
15 (median), 22, 20, 22, 18, 16, 17, 4.]

Total               268

similar items were compared to see whether they were the same
or different. Frequency distributions of the number of times 
an item was responded to differently on the two blanks are to 
be found in Table 1. A count was also made of the total num- 
ber of items altered by each individual, and these frequency 
distributions appear in Table 2. 
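
The two tabulations just described (alterations per item and alterations per subject) amount to counting disagreements along the two axes of a subjects-by-items array. A minimal sketch with hypothetical data follows; the function name and the response codes "L", "I", "D" (like, indifferent, dislike) are assumptions made for illustration.

```python
def tabulate_alterations(first_blank, second_blank):
    """Count differing responses per item and per subject.

    Each argument is a list with one entry per subject; each entry is that
    subject's list of responses ("L", "I", or "D") to the common items,
    in the same item order on both blanks.
    """
    n_items = len(first_blank[0])
    per_item = [0] * n_items
    per_subject = []
    for responses_a, responses_b in zip(first_blank, second_blank):
        changed = [a != b for a, b in zip(responses_a, responses_b)]
        per_subject.append(sum(changed))
        for item, was_changed in enumerate(changed):
            per_item[item] += was_changed
    return per_item, per_subject


# Two hypothetical subjects answering three common items on each blank.
mens_blank = [["L", "I", "D"], ["L", "L", "D"]]
womens_blank = [["L", "D", "D"], ["I", "L", "L"]]
per_item, per_subject = tabulate_alterations(mens_blank, womens_blank)
print(per_item, per_subject)
```

Frequency distributions such as those in Tables 1 and 2 would then be tallies of `per_item` and `per_subject` respectively.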


TABLE 2
Frequency Distribution of the Total Number of Times Each Subject
Altered His Response to Common Items

Total Number of     Frequency          Frequency (69 Life
Alterations         (60 Physicians)    Insurance Saleswomen)

[Table body illegible in the source. The class intervals run from
105-109.9 down to 15-19.9 in steps of five; the individual frequencies
and the positions of the two medians cannot be recovered.]

Total               60






The first thing to be noticed in Table 1 is that the item 
changed most frequently was altered by less than half of the 
subjects in each group. This item, which was changed 29 
times by each group, is one which is not really the same on the 
two blanks. It occurs on the men’s blank as ‘‘social problem 
movies’’ and on the women’s blank as ‘‘movies.’’ The items
were compared because the connotations of the other three 
types of movies mentioned on the men’s blank—travel, educa- 
tional, and cowboy movies—were thought to be still farther 
removed from the connotation of ‘‘movies.’’ It is quite pos- 
sible that this item should not be considered. Whether or not 
it is included, no item is changed by as many as half of either 
group, and the median per cent of times the response to any 
item is altered is 16.7 for physicians and 17.4 for life insurance 
saleswomen. 
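
The percentages quoted above are simple proportions of each group. In the sketch below the group sizes (60 and 69) and the count of 29 come from the text; the median counts of 10 and 12 are back-calculated assumptions, chosen because they reproduce the reported 16.7 and 17.4 per cent. The function name is illustrative.

```python
def percent_altered(times_altered: float, group_size: int) -> float:
    """Express an alteration count as a per cent of the group."""
    return 100.0 * times_altered / group_size


# Median counts of 10 and 12 are assumptions that match the reported medians.
physicians_median = round(percent_altered(10, 60), 1)
saleswomen_median = round(percent_altered(12, 69), 1)
print(physicians_median, saleswomen_median)  # 16.7 17.4

# The most frequently altered item (29 changes) stays under half of each group.
print(percent_altered(29, 60) < 50, percent_altered(29, 69) < 50)
```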

It is of some interest to examine the extremely stable or un- 
stable items. The two items changed by none of the physicians 
are ‘‘Art galleries’’ and ‘‘Work which interests you with modest
income’’ versus ‘‘Work which does not interest you with
large income.’’ These items are also among the most stable
for the saleswomen. ‘‘Modern languages,’’ ‘‘Usually drive
myself steadily’’ and ‘‘Worry considerably about mistakes’’
were each responded to in different ways only by one physician.
Relatively unstable items for physicians include ‘‘Opportunity
for promotion’’ as a factor affecting choice of work,
‘‘Operating machinery,’’ ‘‘School teacher’’ and ‘‘Athletic men
or women.’’ Additional stable items for insurance saleswomen
include ‘‘Life Insurance salesman,’’ ‘‘Raising flowers and
vegetables’’ and ‘‘Contributing to charities.’’ Unstable items
include ‘‘Scientific research worker,’’ ‘‘Expressing judgments
publicly regardless of criticism’’ and ‘‘Develop plans’’ versus
‘‘Execute plans.’’ It should be made clear that when the response
to an item was altered, it was almost always a change
between one of the extremes and the middle position, i.e., from
like to indifferent or vice versa rather than from like to dislike.

When we consider the number of items altered by each sub- 
ject (see Table 2), we find that some subjects change their 
responses six or seven times as frequently as others. Whether
this is just carelessness or whether, by the time the item
occurred the second time, a slight change in mood had occurred
to alter the response, it is not possible to say. Moreover, some
subjects may have referred back to the first blank to see what
response was made previously. The directions to ‘‘ Work rap- 
idly. Your first impressions are desired here,’’ probably held 
such comparison of responses by the subject to a minimum. 
The median physician altered 17 per cent and the median sales- 
woman altered 19 per cent of the items. 

These data suggest that responses to items occurring on both 
blanks are essentially the same. Thus the test re-test reliabil- 
ity of the blanks is probably quite satisfactory. This being so, 
we may expect the weights for the scoring keys to be fairly 
well established, and it seems legitimate to assume that we are 
comparing the vocational interests of Strong’s criterion groups 
when we compare weights given to common items on the two 
blanks by keys of the same name. The keys studied were 
dentist, physician, artist, lawyer, life insurance sales, teacher, 
Y.M.C.A.-Y.W.C.A. secretary, office clerk-stenographer, and 
purchasing agent-office worker. 

The comparison could be made only after the scoring weights 
on the men’s key were reduced by means of the proper formula 
to compare with the weights on the women’s keys. Then three 
scatter diagrams, showing the relationship of the weightings 
for each of the three possible responses, ‘‘like,’’ ‘‘indifferent,’’ 
and ‘‘dislike,’’ were made for each pair of scoring keys. The 
heavy weightings were found to occur almost always in the 
‘‘like’’ or ‘‘dislike’’ categories, and the ‘‘indifferent’’ responses 
tended to be weighted rather lightly. The lawyer keys ap- 
peared to show somewhat less tendency to covariation than did 
the other keys, but on the whole, the tendency to association 
was marked. 

For the 268 common items, then, the interests of these nine 
criterion groups of men and women in the same occupations 
were rather similar. It seemed possible, however, that items 
which occurred only on one blank, and so might be considered
especially significant for the sex for which the blank was de- 
signed, might be weighted more heavily than were the common 
items. If that were true, these different items might well form 
the basis for sex differences in occupational interest patterns.
To test this hypothesis, a count was made of the number of
weightings of ±4, ±3, ±2, ±1 and 0 occurring among
common items and the number of such weightings occurring
among different items for each key. Each item had three 
weights for each key, one for each possible response to the item. 
Table 3 shows the distributions of weightings and the prob- 
ability that the two distributions for each key for each blank 
are samples drawn from an homogeneous population. 

In all cases, if any difference occurs at all, the more heavily 
weighted items are the common items. For the men’s blank, 
there appear to be significant differences (p < .05) in 6 of the 
9 keys but no difference for lawyer, life insurance salesman, 
or purchasing agent-office worker. For the women’s blank, 
only two keys show differences of such a magnitude as to be 
significant, namely, lawyer and Y.W.C.A. secretary. The 
M-F key which is based only on common items for the women’s 
blank was analyzed for the significance of the difference be- 
tween the distribution of weightings for common items and for 
items occurring only on the men’s blank. Table 4 shows these
distributions, which seem to differ significantly, the frequency
of ±3’s being greater among common items and the frequency
of ±2’s being greater among different items than is to be
expected by chance.

TABLE 4

Distribution of Weights Given Common Items and Items Occurring Only
on Men’s Blank by the M-F Key


It appears that the common items are the more valuable (that
is, contribute more to the total score on the men’s keys)
than the ‘‘different’’ items in most cases, and when the 
women’s blank was being constructed, more of the heavily 
weighted items for these keys were saved than were discarded. 
It also seems that the items which were added to the common 
ones to make up the women’s blank contribute about the same 
amount, proportionately, as the common items. 


SUMMARY 


1. The Vocational Interest Blank and the Vocational Inter- 
est Blank for Women appear to be quite reliable as is indi- 
cated by the fact that in responding to items twice, the median 
number of responses altered was about 18 per cent. 

2. There is substantial agreement between the weight given 
to the response to each item by the men’s key and the weight 
given by the women’s key for the same occupation. 

3. Items occurring on both blanks are as heavily or more 
heavily weighted than items occurring only on one blank. 


CONCLUSIONS 


Analysis of the scores of two groups of professional women 
on the Vocational Interest Blank and the Vocational Interest 
Blank for Women and of the scoring keys for these two blanks 
shows that the interests of men and women engaged in the 
same occupation tend to be similar. Differences do occur in a
few cases, however. The efficiency of interest measurement 
might be increased by composing a blank to be used with both 
sexes, giving it to men and women in all occupations, and com- 
paring responses for men and women in occupations pursued 
by both sexes. Where sex differences were significant, a key 
for each sex should be constructed. Probably in many occu- 
pations, a single key would suffice. At least all indications of 
this study are that differences between sexes in an occupation 
are usually less frequent and less important than similarities 

MOTIVATIONAL PROBLEMS IN STUDENT
COUNSELING
D. D. FEDER AND J. S. KOUNIN
University of Illinois 


REFLECTING an insistent and growing trend in modern
psychological thinking, Murphy has said: “Motivation,
instead of being a special problem, a rather interesting
hobby for those who wish to leave the psychological highroad,
is a central problem in relation to which others must be seen.”
The central importance of the study of motivation derives from 
the fact that it goes beyond descriptive statements and at- 
tempts to arrive at explanatory statements. That is, it 
attempts to ascertain not only what an individual is doing, but 
why he behaves as he does. Whereas most correlational 
studies reveal only static relationships, the study of motiva- 
tional components deals with causal relationships in behavior. 
Only a knowledge of causal relationships makes it possible to 
predict, and hence to direct and control, the behavior of an 
individual in a concrete situation.

In almost every counseling situation the counselor strives in 
some manner to modify or intensify behavior. To do this 
requires that the counselor analyze the causes of a student’s 
behavior, and attempt to manipulate the causal or motivational 
pattern in order to change behavior. Discussing the general 
unsatisfactory status of knowledge concerning problems of 
motivation of college students, Williamson points out that in 
the absence of meaningful and reliable studies, the counselor 
must inspect all relevant data concerning a case, and then, 
“From these data and the student’s verbal reports the counselor
must infer a causal relationship with the student’s
behavior.”¹ Although such inferences may always be a necessary
part of the counseling procedure, and are certainly
acceptable to a certain degree in scientific procedure, the ultimate
test of their validity is not merely the student’s evaluation
of relevancy, but the isolation and experimental study of
these inferred causes.

1 E. G. Williamson, How to Counsel Students. New York: McGraw-Hill, 1939.

So little has been done in the way of applying techniques of 
dynamic experimentation that it is impossible to indicate at 
this time which procedures may be used with confidence and 
which ones have been fruitless. 

This paper will present briefly some of the better established 
generalizations with regard to student motivation; report the 
results of several exploratory studies in which a more nearly 
dynamic approach has been attempted; outline those areas and
factors in the motivational process which are significant to the 
counselor as he attempts to understand the individual student; 
and finally indicate some of the directions in which further 
study is needed and may be expected to furnish valuable data 
for counseling purposes. 

Because it is the most readily available, and frequently most 
diagnostic of the criteria of student adjustment, the grade 
point average is the most widely used criterion in student per- 
sonnel studies. As a matter of fact, there are other equally 
important behavior manifestations which should be studied as 
criteria of adjustment, but the fact remains that a marked 
deviation between what a student can achieve, in terms of his 
abilities, and what he actually does achieve, in terms of school 
marks, is generally indicative of some sort of maladjustment. 
Not only is underachievement symptomatic, but overachieve- 
ment may also indicate an emotional or other imbalance in the 
life pattern of the individual. 

On the basis of available data, the following generalizations 
seem warranted:

1. The highly motivated student shows greater correspondence
between ability and achievement than does the less
well motivated student. 

2. Although there is some inconsistency, in general, studies 
indicate that the possession of a definite career interest or
vocational motivation tends to result in slightly superior 
achievement. 

3. A major factor which students hold responsible for their 
success in given courses is their interest in the work; however, 
the studies have not shown the possible sources of the interest 
which may be past experience and learning, vocational significance,
etc. It is also well known that interest in a subject may
result from successful experience with it. 

4. In several studies, students report that their reason for 
studying may be attributed directly to their desire for
instructor’s approval; on a similar level is their desire for high grades
in order to gain social approval. 

5. In most studies in which the criterion has been students’ 
reports of time spent on activities, it has generally been con- 
cluded that studies are most central, that is, of highest moti- 
vational intensity, to the student, and that contact with in- 
structors, extra-curricular, and social activities are somewhat 
peripheral, that is, have less motivational intensity. It may 
be questioned, however, whether time spent is as valid a cri- 
terion of centrality, as, for example, number of times one 
thinks about an activity in a given period of time. 

6. Students will study more, with greater spontaneity, and 
with greater personal satisfaction in subjects which are more 
central to them—electives and majors—than in courses which 
are required. 

7. Rewards such as honors, special privileges, scholarships, 
Phi Beta Kappa, etc., have been regarded as important sources 
of motivation for the student, because he may really be proud 
of the standard of achievement which the award signifies, and 
also, because he may enjoy the social approval or attention
which accompanies the award. It should be noted in this connection
that reward and punishment are not necessarily effective
as motivators per se, but rather because of more fundamental 
psychological needs which they satisfy. 

8. Students who must do some work for self-support, work 
toward their goals of academic success, with the hope of consequent
vocational success, more consistently and with better
results (in terms of grades) than students who do not have to 
do outside work. 

9. Students who participate in extra-curricular activities 
show higher achievement scholastically than students who do 
not participate. However, there is no basis for assuming any 
kind of functional inter-relationship between extra-curricular 
participation and scholastic success as such. There is some 
evidence to indicate that students feel that extra-curricular 
activities have a direct life significance correlated with their 
vocational goals. If such should be the case, then the finding 
would bear a logical relationship to the previously noted 
generalization on the effect of having a definite vocational goal. 

Meaningful study of motivation requires that the experi- 
mental analysis be made under conditions which approach as 
nearly as possible the dynamic relationships of true life be- 
havior. This problem is especially aggravated in working with 
college students, because their maturity makes for such a de- 
gree of sophistication that they are disposed to question an 
experimental situation which does not have direct meaning for 
them. Although the child may be induced to enter an experi- 
mental situation wholeheartedly as a play experience, the 
college student must be shown a meaningful purpose, or else 
he enters the situation in a frankly ‘‘guinea pig’’ state of mind, 
and the results are thereby colored. To illustrate some of the 
problems encountered, and some of the exploratory means 
being used in an effort to derive usable techniques both for 
research and counseling purposes, a few of our studies in this 
area are herewith presented. 

Marion T. Nagler, now at Penn College, Oskaloosa, Iowa, 
developed a controlled interview technique designed to deter- 
mine intensity and sources of marriage and career motivation 
of ninety-one junior transfer girls at the University of Iowa.²
Negative and positive weights assigned to each response permitted
the calculation of a career motivation and marriage
motivation score for each subject.

2 This study, and the one with Mr. Wright, were made at the State
University of Iowa under the direction of D. D. Feder.


Consistent with the results of other studies, with ability held 
constant, it was found that the girls with stronger vocational 
motivation achieved better than those less well motivated, par- 
ticularly in those subjects which were recognized as bearing 
directly upon the vocation. The better achievement in the 
specific courses was largely responsible for the generally supe- 
rior achievement of the highly motivated groups. 

Although nearly all the girls showed a decided interest in 
the possibility of future marriage, those who had intentions of 
professional careers exhibited markedly less marriage motiva- 
tion than those who had chosen the teaching profession or busi- 
ness as their careers. Observation of the marital adjustments 
of their parents and friends was the chief influence determin- 
ing a girl’s decision concerning marriage. College preparation 
was regarded as a form of insurance in the event that they 
might have to become self-supporting at some later date. 

From the point of view of possible further experimentation 
one of the most significant findings was that the motivational 
ratings were almost as effective in the prediction of general 
achievement as the intelligence test scores. For predicting 
achievement in those subjects which the student recognized as 
valuable for vocational preparation, the motivation rating was 
slightly superior to the intelligence test. 

To study the effect of different goals upon students’ achieve- 
ment and insights, an experiment was conducted with students 
in first year college physics at the University of Iowa, with the 
cooperation of Mr. M. Erik Wright, research assistant in the 
department of psychology at the University of Iowa. In addi- 
tion to the usual criteria of achievement, an effort was made to 
evaluate insight into the subject matter as revealed by the 
student’s ability to make applications of it to everyday life 
and to his future college work. The students were divided into 
three groups, on the basis of their possible goals in taking the 
course. One group consisted of those who enrolled in the
course in order to meet the Liberal Arts College science re- 
quirement. A second group consisted of those who took the 
course as part of their pre-medical and pre-dental training. 
The third group consisted of those with ‘‘pure’’ science inter- 
ests, who took the course because of anticipated intrinsic 
values. 

Despite their superiority in ability, those who took the course
as part of the Liberal Arts requirement showed the poorest
achievement. Although the pre-medical and pre-dentistry
students earned slightly but not significantly better grades 
than did the pure science students, the latter group signifi- 
cantly exceeded both the others in quality of insight into the 
subject as exhibited by the practical applications they were 
able to make. The principle of individual differences in abil- 
ity as a conditioner of achievement has long been recognized. 
This study points to the fact that similar differences in moti- 
vation may be equally significant conditioners. Certainly it 
is clear that just as the instructor may not assume absolute 
equality of ability among the members of a class, so also he 
must not assume absolute community of motivation. 

Two studies now in progress at the University of Illinois 
have attacked other aspects of the motivation problem. In 
one, an experienced counselor is studying the effects of nega- 
tive and positive stimulation through the interview upon 
achievement and other adjustments. Two matched groups of 
high ability freshmen and two matched groups of low ability 
freshmen are being used as subjects. One group of each pair 
was given positive stimulation, that is, told that their scholas- 
tic prognosis was good. The other group was given negative 
stimulation, that is, told that their scholastic prognosis was 
bad. Observations of each student’s reactions are being 
studied to determine effects of the information and the way in 
which it was received. The experimental groups are being 
compared with uncounselled control groups. A follow-up sur- 
vey of the students’ remembrances of and reactions to the 
interview is being made at the present time. In all, it is hoped
that this study will shed some light on the effects of knowledge 
of ability, positive as contrasted with negative stimulation, and 
perhaps most importantly, the effectiveness of a single coun- 
seling interview. 

For the purpose of suggesting appropriate counseling tech- 
niques as well as for the delineation of areas in need of further 
study, the following classification of the motivation problem 
has been prepared. It is hoped that from this classification it 
may be possible to draw a frame of reference which the coun- 
selor can use in the diagnosis of student problems which seem 
to have their roots in motivational maladjustment. Motivation 
is a resultant of many different factors not yet adequately 
explored or understood. In order to improve motivation, 
therefore, we must get at the constituent factors as they exist 
functionally in the motivational pattern of a given student. 

Psychologically the problem may be divided into three
phases: first, the absence of needs; second, the absence of
goals; and third, the presence of barriers to psychological
movement which are too great. Under each of these headings we
shall consider typical problems and possible solutions.


I. THE ABSENCE OF NEEDS 


Due to the frustration of central (highly ego-involved) 
needs, a state of tension exists which is so diffuse that it 
prevents the crystallization of a concrete need, and the 
differentiation of a tension which would give rise to
action. A specific example of this is found in the case of
the student who worries about home, studies, etc., and as
a result finds himself unable to concentrate. Another 
example may be the boy who is left alone in his house on 
a week-end night, because everyone else is out on a date. 
Whereas he normally might indulge in some study, a 
“coke,” and a bit of a bull session, under these conditions
he vacillates from one thing to another, unable to find
satisfaction and resolution of tension in any of the accustomed
activities. By helping the student solve his home
problems, by working through the home to directly modify 
the worry-causing situations, the counselor may remove 
much of the diffuse emotional tension so that differentia- 
tion of a specific tension for appropriate action may 
ensue. 

Because much of the typical college curriculum is not 
related to the needs of any individual student, he ‘‘sees 
no use for it’’ and therefore sees no purpose in studying. 
In such cases the counselor may either work on the stu- 
dent’s attitudes in an effort to change them; he may sug- 
gest changes of courses which will provide a curriculum 
more adequate for the student in terms of his needs; he 
may attempt to restructure the student’s interpretations 
by indicating how the courses may be related to his needs; 
and, of course, we may hope for, some day, a restructur- 
ing of college curricula in terms of students’ actual needs. 
Frequently the student will have more potent needs in 
other directions which are not as acceptable scholastically. 
Thus, there is the student who comes to college with the 
idea that ‘‘a good time is more important than getting 
good grades.’’ From the college standpoint such a stu- 
dent is disoriented, although psychologically he may be 
well adjusted. Traditional procedure in such cases has 
been to moralize, give “pep talks,” show examples of more
desirable activity, or even to attempt to instill a sense of 
guilt or sin. From a psychological viewpoint, what the 
counselor desires to achieve is a complete resolution of the 
tension. Therefore, it is necessary to encourage the stu- 
dent to seek stronger satisfactions. The counselor may 
suggest a variety of activities which will enable the stu- 
dent to get plenty of wholesome good times, and satisfy 
the need which he feels in this direction. The counselor
may even reaffirm and justify this stronger need in order
to prevent the more serious tensions which may arise from
a feeling of guilt. When the tension has been resolved in
appropriate activity the student is then ready to recognize
and respond to other needs.

3 The Darleys have written significantly on this point in a paper entitled
“The Keystone of Curricular Planning,” Journal of Higher Education,
January, 1937.


II. THE ABSENCE OF GOALS


(Both with reference to courses and extra-curricular activities)

Although the problem in a few cases seems to be a complete
absence of goals, generally it is one of too many conflicting
goals, so that the student is unable to determine an appropriate
direction of action and follow it through to the satisfaction
of his needs.

A. Frequently the possession of intense interests in two somewhat
unrelated vocational areas prevents a channelization
of direction. To assist the student in determining a 
direction for his activity, the counselor may work out 
paths which will, for the time being, lead to the satisfac- 
tion of both interests; or through further analysis of the 
student himself, it may be possible to determine that one 
goal is more appropriate than the other in terms of the 
student’s possibilities of success in a given area. 

Many students are beset by the problem of too many out- 
side interests, such as movies, dates, bull sessions, extra-curricular
activities, etc. This condition is related to the
condition of having needs not scholastically acceptable. 
Here again the counselor may follow the procedure of 
making the goals more definite. A well-established tech- 
nique is that of having the student record his activities 
for a few weeks as the basis for developing a time schedule. 
Psychologically, the time schedule makes for rigid structurization
of the student’s life activities, permits less diffusion
of tensions, which psychologically means greater
efficiency.

All too frequently a student’s sub-goals, as illustrated by 
specific courses, are not integrated with his major goals.
Every counselor is familiar with the complaint that cer- 
tain courses ‘‘don’t lead anywhere.’’ Especially is this 
true of courses which are required, or which are psycho- 
logically ‘‘insignificant’’ to the student, or which involve 
irrelevant activities which the instructor has failed to 
integrate with the major purposes of the course. Here 
the counselor may have to restructure the student’s think- 
ing by showing the specific connections of apparently 
unrelated courses and activities to the main paths which 
the student is pursuing; or again he may have to change 
the student’s path, through a change in curriculum, for 
example, to one that does lead to the desired goals. 

A commonly encountered difficulty at the college level 
results from the lack of definition of goals. That is, the 
instructor does not make clear to the student what is expected
in a course. Here is a realm in which the counselor
may work with and through the instructor. Experi- 
mental evidence indicates that well constructed and com- 
prehensive outlines of goals and objectives, even though 
they may make the course more elaborate and more diffi- 
cult, will actually lead to more and better achievement 
than vague and ill-defined courses. 

The feeling of a lack of progress frequently leads to a 
state of satiation akin to boredom. Every counselor is 
familiar with phenomena like ‘‘senior year boredom.’’ 
Such feelings are usually the result of psychological repe- 
tition, that is, the student feels he is just doing the same 
thing over and over and not getting anywhere. Satiation 
may likewise be due to the lack of definite goals, and is 
frequently encountered in the situation previously dis- 
cussed. A number of remedial procedures are appropri- 
ate at this point. First of all, it is necessary to give the 
student a feeling of progress. This is especially relevant 
in areas where progress is not too evident, such as reme- 
dial reading where the changes are imperceptible to the 
student. Spaced study during which the student stops
before he becomes completely satiated with the activity 
and then returns to it later will be effective. For some 
students it may be effective to suggest that they leave the 
field of study completely for a time, going to classes, and 
keeping up general appearances, but doing no studying. 
When the student then returns to the field he will encoun- 
ter a sufficient number of activities so that he will have no 
difficulty in observing progress as he catches up on missed 
assignments. 

The tendency of the student to accept certain group goals 
as substitutes for his own previously held individual goals 
may bring him into conflict. The group may induce per- 
sonal, educational, or vocational goals in areas in which 
the individual is not likely to succeed. Furthermore, 
within any one area the group may induce a level of aspira- 
tion which is above or below the individual’s optimum level 
of achievement. A common illustration of this is the 
“gentleman’s C” idea prevalent in certain closely structured
college social groups. Responding to this level of
aspiration, a student of high ability may aspire to achieve- 
ment which is socially acceptable, but which, from the 
point of view of the counselor, represents human waste 
and inefficiency. To counteract such influences, the coun- 
selor must attempt to reduce the potency of the group in- 
fluence and increase the potency of the individual’s own 
goals in particular areas. The student may be led to see 
that his belongingness to the group does not necessarily 
involve the uncritical adoption of all its goals; or the 
counselor may appeal to the individuality need of the stu- 
dent, and thus attempt to increase his ego-satisfying goals. 
When direct motivation is impossible, resort to reward 
situations may be indicated. This is traditional educa- 
tional practice, despite the fact that it means the use of 
extrinsic rather than intrinsic motivation. Thus, the stu- 
dent works for grades or the approval of the teacher. It 
is conceivable that the more direct contacts with instructors
provided by tutorial and similar systems of teaching
may induce attraction values in the subject matter itself,
so that the student may be stimulated to work for less artificial
goals than grades or smiles of approval.


III. PSYCHOLOGICAL BARRIERS TOO GREAT 


When the barriers which a student encounters are so great 
that he finds himself incapable of surmounting or removing 
them, frustration results. Such a situation produces diffuse 
tension, which results in lack of direction, reduction in crea- 
tivity, cessation of activity, or other types of withdrawing 
behavior. 

A. A student’s level of aspiration, either vocationally or aca- 
demically, may be too high for his level of ability. Over- 
ambitious parents or social expectations of the community 
may force the student to attempt to follow a vocational 
pattern for which he is not suited ; or a student may aspire 
to A grades when his ability level precludes anything 
much better than C’s. Any achievement which is below 
the level of aspiration set is a failure for the student, 
although ‘‘objectively’’ he may appear ‘‘successful.’’ In- 
stead of hardening him to failure, such repeated failures 
may result in a ‘‘failure personality.’’ From a psycho- 
logical point of view, therefore, the best preparation for 
failure is a series of success experiences, where the achieve- 
ments are in keeping with the individual’s level of aspira- 
tion. The counselor may have to indicate to the student 
that other goals are more consistent with his abilities and 
interests. This may involve complete reorientation of 
the student’s vocational goals and concepts of values. 
And, most important, the counselor must help the student 
analyze the situation so that he can see that the incompatible
levels of aspiration are frequently induced by
others and do not correspond with his own wishes. 
Reducing the strength of the barriers may sufficiently 
resolve the tensions so that the student may move freely
through the psychological space. Thus the counselor may
prescribe courses which will be easier or more psychologically
available to the student. The student may be taken
through a step-by-step analysis of the barriers, removing
or reducing them in order. This is essentially the procedure
of diagnostic and remedial work, for example,
wherein a reading difficulty may be removed in order to
aid the student in his study of history or literature. Direct
help to the student by way of remedial study habits,
additional information, etc., may reduce the strength of
barriers and facilitate his motivational adjustment.
Among the most potent barriers which students encounter
are the restraining forces of fear of failure or fear of
ridicule and disapproval. In the classroom this is most
frequently encountered in the student who is unable to
recite when called upon, even though he may know the
exact answer requested. A graduate student seen recently
had feelings of such intensity that they caused a spread of
uncertainty which prevented him from making decisions
in so simple a situation as whether or not to accept a proffered
cigarette. Here the counselor may work with the
instructor to plan success experiences, so that instead of
ridicule and disapproval, the student will meet praise in
a sufficient variety of situations to increase his confidence
in those areas. Such barriers may operate to cause social
maladjustment for large numbers of students. For example,
the student who does not know how to get acquainted
in a new situation, who does not know how to
start conversations, make dates, or act appropriately when
with his associates, may develop a consistent pattern of
withdrawal behavior in social situations. Here again, the
counselor may reduce the individual’s social maladjustment
by planning situations which will give him success
experiences in social situations.




Every counselor can, from his or her own experience, multiply
and magnify the foregoing illustrations. However, the
essential elements of needs, goals, and barriers are present in 
the adjustment pattern of every individual. Regarding the 
foregoing as the essential elements of diagnosis in the moti- 
vational situation of the student, the counselor then has a frame 
of reference for individual study comparable with the profiles 
of abilities, achievements, aptitudes, and interests. 

Research efforts have for the most part neglected the study 
of counseling, all of which is concerned, in a sense, with the 
motivational adjustment of the student. For this reason in 
much counseling we have had to be guided by inference from 
other areas, and the best judgment of experienced counselors. 
Yet, as has been the experience of every science, ‘‘common 
sense’’ judgments are not always substantiated by controlled 
research. 

Already our studies suggest possible reinterpretations of 
problems of student adjustment, so that instead of regarding 
cases of unpredicted behavior, scholastically and otherwise, as
obscure behavior maladjustments, it may be more accurate 
psychologically to regard them as imbalances of needs and 
goals, in the presence of material or psychological barriers 
which prevent normal and adequate behavior expressions. As 
such, an entire new line of approach to research and therapy 
in personnel work is indicated. 

RELATION OF MEASURED INTERESTS TO
THE ALLPORT-VERNON STUDY
OF VALUES
THEODORE R. SARBIN AND RALPH F. BERDIE


University of Minnesota 


INTRODUCTION 


THE interrelations existing between values as measured by
the Allport-Vernon Scale¹ (1, 2, 3) and occupational interests
as measured by the Strong Vocational Interest
Blank (4) are analyzed here for several reasons. In the first
place vocational counselors and psychologists, as well as laymen, 
conceive of personality stereotypes which correspond to spe- 
cific occupations. The business man, the scientist, the social 
worker, and the school teacher are all supposed to have per- 
sonality characteristics which differentiate them from one an- 
other and from people in general. Wide individual differences 
within occupational groups are recognized. Nevertheless, the 
range of any personality measurement in one occupational 
group is probably as large as the range of that personality 
measurement in the total population. 

Some traits have been found which differentiate one occupa- 
tional group from another. These have been measured by 
Strong’s Vocational Interest Blank. The 400 items in this 
blank are too diverse and heterogeneous for us to determine 
adequately if the commonly accepted personality stereotypes 
are valid. This Blank allows us only to state that a person 
has made the same responses as have a certain proportion of 
people in a given occupation. These occupations are defined 
in terms of an empirically constructed scale which samples a 
heterogeneous array of responses. The configurations of responses 
serve as a basis for a theory of occupational interest. 
The Allport-Vernon Scale may offer us the opportunity to test 
the existence of these stereotypes in terms of value profiles. 
We may also be able to determine the variation of value scores 
within each occupation. 

1 The phrase ‘‘Scale of Values’’ has been substituted for ‘‘Study of 
Values’’ in order to prevent confusion in meanings. 

A second, and more practical reason for carrying on this 
study is to investigate the possibility of substituting the 
Allport-Vernon Scale for the Strong Vocational Interest 
Blank. The complexity and cost of scoring the Strong Blank 
prohibit its wide use in many guidance and personnel programs. 
Consequently these programs suffer from lack of adequate 
personality descriptions with regard to vocational interest. 
If significant relationships are found to exist between 
these two measures, then the Allport-Vernon Scale will have 
a definite use in vocational guidance. 

A third reason for studying the relationships between the 
two tests concerns the meaningfulness of each test. The 
Strong test was empirically devised. Any theory of interest 
related to it is developed from discoveries made from applications 
of the test. The construction of the Allport-Vernon 
Scale was based on an already existing typological theory of 
personality. The first test is founded on inductive principles 
and the second on deductive principles. Furthermore, the 
items in one test appear to be strikingly different from the 
items in the other. The methods of responding to the test 
items are likewise different. In the Strong Blank, the subject 
merely signifies preference for each item. In the Scale of 
Values the subject responds by the order-of-merit method and 
by the method of paired comparisons. If we find that groups 
distinguished by one test are also distinguished by the other, a 
relationship between the tests is demonstrated. If this relationship 
substantiates the theory of occupational stereotypes, 
both tests acquire new meaning. 


PROBLEM 


Do individuals who show given patterns of interest on the 
Strong Blank have distinctive profiles on the Scale of Values? 


METHOD 


Fifty-two men students applying to the University of Min- 
nesota Testing Bureau for vocational advice were selected at 
random. Among other tests, they were given the Allport-Vernon 
Scale of Values and the Strong Vocational Interest 
Blank for Men, 1938 edition, form M. The latter test was 
scored for 26 occupational keys. A modification of the pattern 
analysis described by Darley (5) was applied to the Strong 
profiles. The occupational keys were grouped according to the 
results of several factor analyses. The groupings used here 
are as follows: 

   I. artist, psychologist, architect, physician, dentist 
  II. mathematician, engineer, chemist 
 III. farmer, mathematics and physical science teacher 
  IV. Y.M.C.A. physical director, personnel manager, Y.M.C.A. secretary, 
      city school superintendent, minister 
   V. certified public accountant 
  VI. accountant, office man, purchasing agent, banker 
 VII. sales manager, real estate salesman, life insurance salesman 
VIII. advertising man, lawyer, author-journalist 


Each subject received a letter grade of A, B+, B, B-, C+, 
or C on each of the 26 occupational keys. If an individual 
obtained a rating of B- or higher in one-half or more of the 
occupational keys in any one group, he was classified as hav- 
ing interest in that occupational group. If he received a 
rating of C+ or C in more than one-half of the keys in the 
group, he was classified as having no interest in that group. 
According to this classification scheme, if the scores of the 
original criterion groups were taken, less than three per cent 
would fail to show a measured interest in their respective 
occupations. 
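
The classification rule just described can be sketched in a few lines of Python. This is an illustration only: the function name `has_group_interest` and the sample grades are hypothetical and are not taken from the study's records.

```python
# Letter grades on the Strong keys, ordered from highest to lowest.
GRADE_ORDER = ["A", "B+", "B", "B-", "C+", "C"]

def has_group_interest(grades):
    """Apply the rule in the text to one subject and one occupational group.

    `grades` holds the subject's letter grades on the keys of that group.
    Interest is credited when B- or higher is reached on one-half or more
    of the keys; otherwise C+ or C holds on more than one-half of them.
    """
    cutoff = GRADE_ORDER.index("B-")
    high = sum(1 for g in grades if GRADE_ORDER.index(g) <= cutoff)
    return 2 * high >= len(grades)

# Hypothetical grades for one subject (not the study's data).
five_key_group = ["A", "B-", "C", "C+", "B"]   # three of five at B- or higher
four_key_group = ["C", "C+", "B", "C"]         # one of four at B- or higher
print(has_group_interest(five_key_group))   # True
print(has_group_interest(four_key_group))   # False
```

Because exactly one-half of the keys suffices for interest, the two rules in the text are complementary and every subject falls into one class or the other for each group.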

Scores were obtained for each of the subjects on the six 
values of the Allport-Vernon Scale. This scale purports to 
measure the relative prominence of six ‘‘interests or motives’’² 
in personality. The values are based on Spranger’s classification 
of personality types. They are best described by the 
adjectives: theoretical, economic, aesthetic, social, political, and 
religious. 

The ‘‘theoretical’’ man seeks the ‘‘truth.’’ Since the inter- 
ests of the theoretical man are empirical, critical and rational, 
he is necessarily an intellectualist, frequently a scientist or a 
philosopher. His chief aim in life is to order and to systema- 
tize knowledge. 

The ‘‘economic’’ man is primarily concerned with what is 
practical and useful. The authors of the test suggest that this 
type ‘‘conforms well to the prevailing stereotype of the average 
American business man.’’ 

The ‘‘aesthetic’’ man regards ‘‘form and harmony’’ as paramount. 
He finds his chief interests in beauty and in the 
artistic experiences of life. 

The ‘‘social’’ man is a lover of people. ‘‘In the purest form 
the social interest is selfless and tends to approach very closely 
to the religious attitude.’’ He is not interested in people as 
a means to reach other goals but rather as an end in themselves. 
The scale which measures this value has the lowest reliability 
of any and the authors suggest that this classification may not 
be specific enough truly to describe this personality type (1, 3). 

The ‘‘political’’ man is one who desires power. He is not 
necessarily found in the field of politics but may appear in 
many vocations. He seems to be motivated chiefly by a thirst 
for ‘‘personal power, influence, and renown.’’ 

The ‘‘religious’’ man is mystical. The religious experience 
is to him the ultimate experience which life can afford. Seeking 
for unity is his chief characteristic. 

2 Expressions in quotes are taken from the Manual of Directions of the 
Allport-Vernon Scale. 

Comparisons of the raw scores on the six Allport-Vernon 
values were made between people having interest and people 
not having interest in each of the eight vocational groups. 
Thus it was possible to see if subjects who had interests of the 
technical group, the business detail group, or any other group 
were characterized by any special pattern of scores on the 
Allport-Vernon Scale. The mean value scores were compared 
for subjects showing measured interest and subjects not showing 
measured interest. Fisher’s ‘‘t’’ test (6) was used to 
determine the significance of the differences. Allport-Vernon 
profiles typical of subjects in each group were constructed. 
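
The comparison of two group means by the ‘‘t’’ test may be sketched as follows. The sketch assumes the usual pooled-variance form of the test for two independent groups; the function name `pooled_t` and the scores are hypothetical, since the individual scores of the study are not reported. With 52 subjects divided into two groups the test has 50 degrees of freedom, which agrees with the thresholds of 2.01 and 2.68 cited with Table 1.

```python
import math

def pooled_t(group_a, group_b):
    """Student's t for two independent samples with a pooled variance.

    Returns the t value and its degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Sums of squared deviations about each group's own mean.
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    se_diff = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff, df

# Hypothetical theoretical-value scores (not the study's data).
with_interest = [36, 33, 38, 35, 34]
without_interest = [27, 25, 29, 26, 24, 28]
t_value, df = pooled_t(with_interest, without_interest)
print(f"t = {t_value:.2f} with {df} degrees of freedom")
```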


RESULTS 


The means and the significance of the differences between 
the means are shown in Table 1. 


TABLE 1 

Allport-Vernon Mean Scores for Eight Pairs of Interest Pattern Groups 
with the Significance of the Difference between Subjects with 
Interests and Subjects without Interests 
N = 52 

                               Mean of group    Mean of group 
Interest    Allport-Vernon     with interest    without interest 
Group       Scale              pattern          pattern 

I.          Theoretical        35.19            26.45 
            Economic           29.54            34.85 
            Aesthetic          27.31            20.86 
            Social             29.54            29.09 
            Political          28.50            35.64 
            Religious          29.92            33.12 

II.         Theoretical        35.08            24.92 
            Economic           31.29            34.86 
            Aesthetic          23.16            22.08 
            Social             28.87            29.39 
            Political          30.45            35.82 
            Religious          31.26            32.92 

III.        Theoretical        29.74            25.31 
            Economic           33.35            34.04 
            Aesthetic          25.15 
            Social             29.40            28.62 
            Political          32.91            36.69 
            Religious          33.03            30.19 

IV.         Theoretical        26.50            29.86 
            Economic           31.79            34.52 
            Aesthetic          22.63            22.38 
            Social             29.16            29.23 
            Political          33.08            34.30 
            Religious          36.84            29.71 

V.          Theoretical        29.95            28.28 
            Economic           34.09            33.37 
            Aesthetic          23.36 
            Social             29.00 
            Political          35.23 
            Religious          28.36 

VI.         Theoretical        27.61 
            Economic           34.48 
            Aesthetic          20.84 
            Social             28.34 
            Political          35.08 
            Religious          33.66            29.00 

VII.        Theoretical        25.53            34.03 
            Economic           33.42            33.68 
            Aesthetic          22.61            22.24 
            Social             29.97            27.87 
            Political          34.73            32.34 
            Religious          33.74            29.84 

VIII.       Theoretical        27.02            29.91 
            Economic           31.28            35.29 
            Aesthetic          24.89            20.55 
            Social             30.67            28.03 
            Political          34.96            32.98 
            Religious          31.17            33.22 

‘‘t’’ values greater than 2.68, indicated by **, show that the possibility 
of the obtained difference being due to chance is less than 1 in 100. 

‘‘t’’ values greater than 2.01 but less than 2.68, indicated by *, show 
that the possibility of the difference being due to chance is less than 5 
in 100. 


Group I, which includes artist, architect, physician, dentist, 
and psychologist, has the most differentiating profile of any of 
the occupational groups. Four of the Allport-Vernon scores 
discriminate between subjects having interest in this group and 
subjects not having interest. Subjects with this type of occupational 
interest score significantly higher on the theoretical 
and on the aesthetic values, and significantly lower on the 
economic and on the political values. This evidence tends to 
substantiate the prevailing stereotype. The profiles of average 
scores for each of the six values are shown in Figure 1. 

FIG. 1. Allport-Vernon value profiles of subjects with and without 
Group I interests on Strong Interest Blank. 











All the distributions overlap. For example, on the theoretical 
value scores for occupational Group I, eighteen per cent of 
the subjects showing no interest had scores higher than the 
median of the subjects with interest. Of the subjects who had 
no vocational interest in Group I, none obtained scores below 
the median of the subjects with this interest. 

Subjects showing interest in Group II have significantly 
higher scores on the theoretical value and lower scores on the 
political value. This group consists of people showing the in- 
terests of mathematicians, engineers, and chemists. We should 
expect their profiles to be somewhat similar to those of the 
subjects in Group I. 

Especially noteworthy are the scores made on the religious 
scale by subjects with interest in Group IV. This group in- 
cludes ministers, personnel managers, and Y.M.C.A. workers. 
The religious value was the only one of the six which differen- 
tiated significantly between subjects with interest and without 
interest in this occupational grouping. The profiles are pre- 
sented in Figure 2. Again there was considerable over- 
lapping between the two distributions. Twenty-six per cent of 
the subjects with no interest obtained scores above the median 
of the subjects with interest. Twenty-one per cent of the sub- 
jects with interest in these ‘‘uplift’’ occupations obtained 
scores below the median of the subjects with no interest. 
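
The overlap figures quoted for Groups I and IV are percentages of one group falling beyond the median of the other. A minimal sketch of that computation follows; the function names and the scores are hypothetical and are not the study's data.

```python
from statistics import median

def percent_above_median(scores, reference):
    """Percentage of `scores` lying above the median of `reference`."""
    cut = median(reference)
    return 100.0 * sum(1 for s in scores if s > cut) / len(scores)

def percent_below_median(scores, reference):
    """Percentage of `scores` lying below the median of `reference`."""
    cut = median(reference)
    return 100.0 * sum(1 for s in scores if s < cut) / len(scores)

# Hypothetical religious-value scores (not the study's data).
with_interest = [40, 37, 36, 35, 30]
without_interest = [38, 33, 31, 29, 28, 27, 25, 24]

# Share of the no-interest group scoring above the interest group's median.
print(percent_above_median(without_interest, with_interest))
# Share of the interest group scoring below the no-interest group's median.
print(percent_below_median(with_interest, without_interest))
```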

A significant differentiation was made on the theoretical 
value for those people having ‘‘sales’’ interests—Group VII. 
Subjects with interest in this occupational group have lower 
scores on the theoretical scale than people not having this in- 
terest. One aspect of the stereotype of the typical sales person 
tends to be verified here. 

No differentiation appeared in Group III, which includes 
teachers and farmers; in Group V, certified public accountant; 
in Group VI, the business detail occupations; or in Group 
VIII, the linguistic occupations. 

FIG. 2. Allport-Vernon value profiles of subjects with and without 
Group IV interests on Strong Interest Blank. 

Forty-four of the Strong tests were scored for two non-occupational 
interests: masculinity-femininity and occupational 
level. Pearsonian correlations with scores on the Allport-Vernon 
Scale were found to range between -.49 and +.38. 
These two extreme correlations were the only statistically significant 
coefficients. Between the theoretical value and the 
masculinity-femininity index the coefficient was .38. Between 
the aesthetic value and the same index, the correlation coefficient 
was -.49. 
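
One common way to judge whether a Pearsonian coefficient differs from zero is to convert it to a ‘‘t’’ value with n - 2 degrees of freedom. The paper does not state which test was applied to these coefficients, so the sketch below is an assumption; the function name `t_from_r` is hypothetical.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# The two extreme coefficients reported for the 44 scored blanks.
for r in (0.38, -0.49):
    print(r, round(t_from_r(r, 44), 2))
# prints: 0.38 2.66
#         -0.49 -3.64
```

With 42 degrees of freedom a ‘‘t’’ of roughly 2.02 is needed at the 5 per cent level, so both coefficients would be judged significant under this assumed test.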


CONCLUSIONS 


Relationships have been demonstrated between the values 
measured by the Allport-Vernon Scale and interests as 
measured by the Strong Blank. A few of the occupational 
groups showing measured interest patterns are characterized 
by certain profiles on the Allport-Vernon Scale. Groups may 
be easily differentiated. However, much overlapping exists 
and individual application of the results of this study is 
hazardous. It is possible, nevertheless, in the absence of other 
interest measurement, to use the Allport-Vernon Scale to 
approximate certain occupational interest types as measured 
by the Strong test. Thus, a definite but limited use is demon- 
strated for the Allport-Vernon scores when it is desirable to 
distinguish or identify vocational interest types in the professional, 
sales, or ‘‘uplift’’ occupations. 


STUDIES IN AUTOMOBILE SPEED ON 
THE HIGHWAY 


I. THE RELATIONSHIP OF CERTAIN FACTORS TO 
SPEED ON THE OPEN HIGHWAY¹ 


C. H. LAWSHE, JR. 


Purdue University 


IN utilizing driving speeds as one measure of the behavior 
of automobile drivers on the highway, it is important that 
analyses of speed patterns be made in order that relationships 
which exist may be taken into account when subsequent 
interpretations are made. The purpose of this phase of a 
more comprehensive study was to determine what factors and 
driver characteristics, if any, bear relationships to the speeds 
at which people drive automobiles on the highway. 

Experimental Procedure. In collecting data for another 
investigation, open highway speeds of 608 drivers were obtained 
by means of the highway speed recorder described by 
Lawshe,² ‘‘open highway speeds’’ being defined as speeds recorded 
on the highway without the knowledge of the driver at 
a point free from obstructions and obvious hazards, which 
point shall be 800 feet or more from a curve or intersection. 
Three different locations were used and observations were 
taken at random between 10:00 A.M. and 5:00 P.M. and on 
every day of the week except Saturday and Sunday. Neither 
the recording equipment nor the investigator could easily be 
seen by the motorist; hence, the assumption is made that the 
attention of the drivers was not distracted by the procedure. 
As the speed of each driver was recorded, a notation was made 
of the number of occupants in the car, the sex of the driver, 
the time of day, the day of the week, and the vehicle license 
number. No data were collected on trucks, and out-of-state 
cars were listed as such without the license number being 
recorded. All observations were made during the summer 
months and in dry weather. 

1 This study is a portion of a thesis submitted to the Faculty of Purdue 
University in partial fulfillment of the requirements for the degree of 
Doctor of Philosophy, June, 1940. 

Acknowledgment is due Dr. Joseph Tiffin, who directed the research, the 
members of the State Highway Commission of Indiana and the officers and 
advisory board members of the Joint Highway Research Project whose 
financial support made the study possible, and Mr. Frank Finney, State 
Commissioner of Motor Vehicles of Indiana, who so generously provided 
the data on car owners from his files. 

2 C. H. Lawshe, Jr., ‘‘Two Devices for Measuring Driving Speed on the 
Highway,’’ American Journal of Psychology, (July, 1940). 

Through the cooperation of the Indiana Bureau of Motor 
Vehicles certain information was obtained about the owner 
of each car by means of the vehicle license number. In- 
cluded in these data were the name, address, and sex of the 
owner, and occasionally his age. Make, age, and weight of car 
were also obtained. It should be pointed out at this time that 
there is no assurance that the person driving and the car owner 
are the same person. However, in most phases of the analysis 
cases were excluded when the sex of the owner did not coin- 
cide with the sex of the driver. By eliminating these obvious 
discrepancies it seems logical to assume that for the most part, 
at least, the owners were the drivers. By means of the address 
of the owner it was determined whether he was an urban 
resident or a rural resident, and, in addition, the approximate 
distance of his home from the point of observation was estimated 
by means of a map. All of these data were punched on 
sorting machine cards in order to facilitate the statistical 
analysis. 

TIME, LOCATION, AND SPEED 


Problem Locations. As was indicated above, observations 
were made at three different locations,³ two of which (Locations 
2 and 3) were on the same highway at points approximately 
four miles apart. Mean speeds at these two locations 
were 46.1 mph ± .54 and 45.6 mph ± .74 for 249 and 136 
drivers respectively. At the location (Location 1) on the 
other highway the mean speed was 43.4 mph ± .61 for 223 
drivers. Because the difference between the mean speeds at 
Locations 2 and 3 is not statistically significant it was decided 
that in other analyses the two distributions could be combined. 
The mean of the combined distribution (45.9 mph ± .44) was 
2.5 mph faster than the mean speed at Location 1 on the other 
highway; this difference is statistically significant, the critical 
ratio being 3.33. 

3 Descriptions of these locations together with a more complete presentation 
of data will be found in the appendix of a thesis by the author on 
file in the library of Purdue University. 

Day of Week. Eddy⁴ has reported that speeds observed on 
Sunday were slower than those observed on other days. These 
data did not permit verification of this finding inasmuch as 
no observations were made on either Saturday or Sunday. 
Means of speeds recorded at Location 1 on each of the days, 
Monday, Tuesday, Thursday, and Friday were compared with 
the mean speeds of all other days combined and no significant 
differences were found. Means of speeds recorded at Locations 
2 and 3 on Monday, Wednesday, Thursday, and Friday 
were compared with speeds at these locations on all other days 
in the same fashion and none of the critical ratios exceeded 
three. However, the mean of the speeds recorded on three 
different Wednesdays at Location 1 was 40.8 mph ± .94 for 
55 drivers while the mean speed of 168 drivers at this location 
on all other days of the week combined was 44.2 mph ± .73. 
This difference of 3.4 mph yields a critical ratio of 2.86, which 
is to say that there are 998 chances in 1000 that the difference 
is a real one and could not have arisen by chance. Since no 
records were obtained at either Location 2 or Location 3 on 
Wednesdays, verification is not possible at those locations 
with the data available. 
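
The statement of ‘‘chances in 1000’’ can be reproduced from the critical ratio on the assumption that the ratio is referred to the normal curve and that a one-tailed area is intended. The sketch below is an illustration under that assumption; the function name `chances_in_1000` is hypothetical.

```python
import math

def chances_in_1000(critical_ratio):
    """Chances in 1000 that the true difference is greater than zero,
    taken as the one-tailed area of the normal curve below the ratio."""
    area = 0.5 * (1.0 + math.erf(critical_ratio / math.sqrt(2.0)))
    return round(1000 * area)

print(chances_in_1000(2.86))  # 998
```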


4 R. C. Eddy, ‘‘Interesting Phases of the Massachusetts Highway Accident 
Survey,’’ Proceedings Institute of Traffic Engineers, V (1934), 
78-82. 

Time of Day. Mean speeds were computed for seven different 
hourly periods of the day starting with the 10:00 to 
11:00 A.M. interval and extending to the 4:00 to 5:00 P.M. 
interval. These means ranged from 44.1 mph ± 1.37 to 45.8 
mph ± 1.37 with the exception of the mean of the 10:00 to 
11:00 A.M. interval which was 42.4 mph ± 1.71. All of the 
differences were within the range of normal expectancy for 
sampling errors. 


DRIVER CHARACTERISTICS AND SPEED 


Sex of Driver. Of the 608 speed records that were ob- 
tained, 505 of the drivers were men and 103 were women. 
These men had a mean speed of 45.5 mph ± .40 as compared 
to a mean of 42.5 mph ± .77 for the women. This difference 
of 3 mph is statistically significant since the critical ratio is 
3.45. Other variables apparently operating will be discussed 
later. 
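
The critical ratios reported throughout can be checked from the published means and their errors. The sketch below assumes that the figure following each mean is the standard error of that mean and that the two groups are independent; the function name `critical_ratio` is hypothetical.

```python
import math

def critical_ratio(mean_a, se_a, mean_b, se_b):
    """Difference between two means divided by the standard error of the
    difference, the two groups being treated as independent."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / se_diff

# Men and women drivers, using the rounded figures given in the text.
print(round(critical_ratio(45.5, 0.40, 42.5, 0.77), 2))  # 3.46
```

The result is about 3.46 against the reported 3.45; a discrepancy of this size is what would be expected if the published means and errors were rounded before printing.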

Age. Considerable attention has been directed to the fact 
that young drivers have a greater proportion of the accidents 
than can be attributed to them by chance.⁵, ⁶ Since age and 
speed data were both available for 181 drivers, speeds by age 
groups were examined to determine whether or not those age 
groups having the most accidents drive faster (Table I). As 
has previously been mentioned all cases were discarded when 
the sex of the driver and the sex of the owner did not corre- 
spond. It will be noted that while none of the critical ratios 
exceeds 3, there seems to be a distinct tendency (998 chances 
in 1000 that the true difference is greater than zero) for 
drivers in the 40 to 49 age group to drive faster than other 
drivers observed. However, there is no evidence in these data 
to indicate that the tendency for younger drivers to have 
more than their share of accidents can be attributed to the 
speed factor. 


5 ‘‘Menace of Youthful Drivers,’’ Safety Engineering, CXVIII (July, 
1934), 21. 

6 Motor Vehicle Traffic Conditions in the United States, Part 6, The 
Accident Prone Driver, House Document No. 462. Washington: U. S. 
Government Printing Office, 1938. 


TABLE I 

Comparisons of Mean Speeds of Drivers on the Open Highway 
By Age Groups 

Age Group | N | M | S.D.M. | Diff. | S.D. Diff. 

20 to 29     1.87 
All others 

30 to 39 
All others 

40 to 49 
All others 

Over 49 
All others 


OTHER FACTORS AND INTERRELATIONSHIPS 


Place of Residence. While it was impossible to determine 
from some of the addresses supplied whether owners resided 
in cities or in the country, 91 had rural route addresses and 
237 had street addresses. As is indicated in Table II those 
with rural route addresses had a mean speed of 39.9 mph ± 
.97 while those with street addresses had a mean of 45.5 mph ± 
.53. This difference of 5.5 mph is statistically significant with 
a critical ratio of 5.00. 

In-state residents numbering 406 had a mean speed of 43.6 
mph ± .43 while 174 drivers of cars with out-of-state licenses 
had a mean speed of 47.5 mph ± .66. This difference likewise 
is significant with a critical ratio of 4.94. The question 
arises whether residing in-state or out-of-state itself is a factor, 
and whether or not there is a tendency for speed to increase 
as the distance from home increases. To help answer these 
questions drivers were classified according to their place of 
residence within the state by consecutive twenty-five mile zones. 
If speed does increase with the driver’s distance from home, 
the mean speeds of these groups could be expected gradually to 
increase. Actually, however, an examination of the data revealed 
that while the group residing within 25 miles of the 
point of observation drove significantly slower than the 26 to 
50 mile group, the 51 to 75 mile group, and the group residing 
75 miles or more away, none of these last three differed signifi- 
cantly from each other. That is to say that those drivers who 
reside within 25 miles of the point of observation drove slower 
than those who live more than 25 miles away. This fact is 
presented clearly in Table II where the critical ratio between 
these two groups is shown to be 6.47. Furthermore, this group 
residing more than 25 miles away was compared to the group 
residing out-of-state and no significant difference was found as 
is also indicated in Table II. It seems therefore that residing 
in-state or out-of-state itself is not a factor and that those who 
live more than 25 miles from the point of observation may be combined with 
the out-of-state group for all practical purposes. 

Since all observations were made in the country and since 
it has been shown that the rural people and the near residents 
drive slower in comparison to the urban people and the far 
residents, the question naturally arises as to whether both of 
these factors are operating to produce speed differences or 
whether it might not be the same people who are causing these 
trends to appear in the means of both classifications. As is 
indicated in Table II two groups of drivers residing within the 
25 mile zone were compared, one group being composed entirely 
of persons having rural route addresses and the other of 
persons having street addresses. Here the means were found 
to be 38.6 mph ± .99 and 43.1 mph ± .74 respectively, the 
critical ratio being 3.66. A similar comparison was made between 
rural and urban groups residing more than 25 miles 
away and the critical ratio was found to be 0.64; however, it 
will be noted that one of these groups has an N of only 16 
drivers. Further comparisons in Table II indicate with a 
high degree of probability that rural residents residing within 
the 25 mile zone drive more slowly than do rural residents residing 
outside the 25 mile zone and that urban residents residing 
within the 25 mile zone drive more slowly than do urban 
residents outside the zone. In other words, the original observation 
seems to obtain except in the case of rural people 
who are more than 25 miles from home; these people appear to 
drive as fast as the urban people also 25 miles or more from 
home. 


TABLE II 

Comparison of the Mean Speeds of Drivers on the Open Highway 
By Residence Groups⁷ 

Residence Groups                                N      M     S.D.M.   Diff.   S.D. Diff.   C.R. 

Rural Address 
Urban Address                                                                              5.00 

All in state 
Out of state 

Within 25 miles 
Over 25 miles (in state) 

Within 25 miles 
Out of state 

Over 25 miles (in state) 
Out of state 

Rural Address, within 25 miles 
Street Address, within 25 miles 

Rural Address, over 25 miles and out state 
Street Address, over 25 miles and out state 

Rural Address, within 25 miles 
Rural Address, over 25 miles and out state 

Street Address, within 25 miles                116            .74 
Street Address, over 25 miles and out state    121            .70      4.5     1.01         4.46 

7 Mileage figures in this table indicate the distance of the driver’s home 
from the point of observation. 

Residence and Sex. It will be recalled that a sex difference 
has already been shown. In view of the apparent importance 
of place of residence just shown it seems wise to examine the 
data to determine what interrelationships, if any, exist between 
the sex factor and the residence factors. If it could be shown 
that most of the women were local drivers and most of the men 
were not local drivers the conclusions would have been changed 
materially. Table III shows four comparisons, three of which 
are statistically significant; that is, local men drove faster than 
local women, local men drove slower than men from more 
distant points, and local women drove slower than women from 
more distant points. However, the mean speed of 285 men 
residing more than 25 miles away is not significantly different 
from the mean speed of 48 women also residing more than 25 
miles away. This fact raises the question as to whether the 
difference between sexes cited earlier is actually a sex difference 
or whether a driving experience factor is operative. In 
other words, if it were known that in general men drive more 
miles than women, that urban people drive more than rural 
people, and that persons farther from home are apt to drive 
more than persons who are less than 25 miles from home, all 
of these significant differences could be explained. However, 
in the absence of annual mileage data, only the hypothesis can 
be advanced at this time. 


TABLE III 

Comparison of Mean Speeds of Drivers by Sex and Place of Residence 

Sex Groups                  N       M      S.D.M.   C.R. 

Within 25 mi., male                42.2     .63 
Within 25 mi., female                       .90 

Over 25 mi., male          285              .51 
Over 25 mi., female 

Male, within 25 mi. 
Male, over 25 mi. 

Female, within 25 mi. 
Female, over 25 mi. 

Number of Occupants. The mean speed of 204 lone drivers 
was found to be 46.4 mph ± .64 while the mean speed of 404 
drivers accompanied by one or more persons was 44.3 mph ± .43, 
a difference of 2.1 mph with a critical ratio of 2.73. In other 
words, there are 997 chances in 1000 that this difference is 
greater than zero and could not have arisen by chance. In the 
absence of data that might explain this difference it can be 
pointed out that if these lone drivers are salesmen and com- 
mercial men, men who drive a great deal in connection with 
their business life, the driving experience hypothesis already 
suggested might be applied. 

Car Data. The age of each driver’s car was correlated 
against his speed as observed on the open highway and an r of 
-.48 ± .04 was obtained. In other words, it can be said that 
in general those who had newer cars drove faster than those 
who drove older ones. In like fashion, car weight was correlated 
with speed and an r of .19 ± .05 was obtained. As also 
would be expected the owners of heavier cars drove slightly 
faster. 


SUMMARY AND CONCLUSIONS 


As a part of another study, the speeds of 608 drivers on the 
open highway were obtained without their knowledge together 
with the vehicle license number, the number of occupants, the 
sex of the driver, the time of day and the day of the week. 
Through the cooperation of state authorities the license number 
was used to determine the place of residence of the owner, 
his sex, age, and the weight and age of his car. The data were 
analyzed after having been punched on sorting machine cards. 

In general, the findings would seem to support the following 
conclusions: 

1. Prevailing speeds varied on the two highways studied. 

2. Speeds on Wednesday were slower than those on any 
other week day at the one location where data were available. 

3. There were no differences in speeds by hours of the day 
between 10:00 A.M. and 5:00 P.M. 

4. Drivers between the ages of 40 and 49 drove faster than 
any other group and faster than all other groups combined. 

5. Drivers in the younger age brackets drove no faster than 
did drivers in the older group. 

6. Those drivers having rural route addresses and those 
residing within 25 miles of the place of observation drove more 
slowly respectively than did those with street addresses and 
those who reside more than 25 miles away. 

7. When these rural-urban and near-far groups were ex- 
amined for possible overlappings, the same relationships were 
found to obtain except when drivers residing more than 25 
miles away were segregated into rural and urban groups. The 
difference here was quite small and was not statistically significant, 
thus indicating that those rural people who were farther 
away from home did not differ from urban people insofar as 
speed is concerned. 

8. The women observed drove more slowly than did the men. 
This was true in the case of all secondary comparisons except 
when men and women residing more than 25 miles from the 
place of observation were compared. Here no difference
existed; those women who were farther away from home did not
differ from men in their speed practices. 

9. Lone drivers drove faster than those who were accompanied
by one or more persons.

10. Drivers of new cars drove faster than did drivers of old
cars and heavy-car drivers drove faster than drivers of light
cars.

11. The hypothesis is advanced that speed differences appearing
between the sex groups, the rural-urban groups, and the
near-far residence groups might be explained on the basis of
amount of driving experience, as might the differences existing
between lone drivers and the drivers who are accompanied by
other occupants.








II. APPROACH SPEEDS AND CHANGES IN SIGN SIZE
AND LOCATION ON THE HIGHWAY


C. H. LAWSHE, JR. 


Purdue University


In comparing the relative merits of traffic signs on the highway,
whether or not the driver can read or identify them
is less important than his response in terms of the behavior
that is expected or desired. Since numerous signs now in use
have as one of their functions the reduction of speed at some
point where higher speeds are thought to be hazardous, it seems
that the speeds of drivers in the presence of these signs should
be taken into account in any evaluation of the effectiveness of
the signs. The present investigation has as its purpose the
establishment of a technique utilizing speed measurement in
the evaluation of certain traffic signs and signing practices on
the highway.

Experimental Procedure. The multiple-speed recorder described
by Lawshe¹ was employed to measure the speeds of
motorists without their knowledge as they approached an
intersection but at a point where preliminary investigation had
shown that deceleration for the intersection had not yet begun.
Likewise, the speeds of these same drivers were also recorded
nearer the intersection at a point on the highway where the
driver had begun to reduce his speed. These two points at
which speeds were recorded were approximately 1100 feet and
400 feet respectively from the center line of the intersecting
highway (1000 feet and 300 feet from the sign) in the
locations² discussed in this paper. By introducing variations
in sign size or location and matching drivers on the basis of their
speeds at the 1100-foot position it was possible to compare
mean speeds as they approached the intersection. At the same
time that speeds were recorded the investigator noted the number
of occupants, the sex of the driver, and the license number
of the car. From the license number³ the place of residence
of the owner was determined. All information was then coded
on sorting machine cards in order to facilitate the statistical
analysis.

¹ C. H. Lawshe, Jr., ‘‘Two Devices for Measuring Driving Speed on the
Highway,’’ American Journal of Psychology, (July, 1940).

² Details concerning problem locations, sign arrangement, and
points at which speeds were observed are presented in the appendix of the
author’s thesis, Psychological Studies of Some Factors Related to Driving
Speed on the Highway, which is on file in the library of Purdue University.
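The matching step described above can be sketched in code. This is only an illustration under stated assumptions: the paper does not say how closely speeds had to agree for two drivers to count as matched, so the `tolerance` parameter, the function names, and the sample figures are all hypothetical.

```python
def match_by_far_speed(group_a, group_b, tolerance=0.5):
    """Pair drivers from two sign conditions on their speed at the far point.

    Each driver is a (far_speed, near_speed) tuple in mph. A driver in
    group_a is paired with the closest unused driver in group_b, provided
    the two far-point speeds differ by no more than `tolerance` mph
    (an assumed value, not taken from the paper).
    """
    remaining = sorted(group_b)
    pairs = []
    for a in sorted(group_a):
        best = min(remaining, key=lambda b: abs(b[0] - a[0]), default=None)
        if best is not None and abs(best[0] - a[0]) <= tolerance:
            pairs.append((a, best))
            remaining.remove(best)
    return pairs


def mean_near_speeds(pairs):
    """Return the mean near-point speed of each side of the matched pairs."""
    n = len(pairs)
    mean_a = sum(a[1] for a, _ in pairs) / n
    mean_b = sum(b[1] for _, b in pairs) / n
    return mean_a, mean_b


# Hypothetical observations: (speed at 1100 feet, speed at 400 feet).
small_sign = [(45.0, 33.0), (40.0, 30.0), (50.0, 36.0)]
large_sign = [(45.2, 34.0), (39.8, 31.0), (60.0, 40.0)]

pairs = match_by_far_speed(small_sign, large_sign)
means = mean_near_speeds(pairs)
```

Drivers with no sufficiently close counterpart are simply left out, which is consistent with the paper matching fewer pairs than the roughly 100 drivers observed under each arrangement.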











Fig. 1. Photograph taken at the first location showing the standard
‘‘stop’’ sign at the intersection.


Size of Sign. At the first location discussed here the speeds
of approximately 100 drivers were recorded as indicated above
with a standard 24-inch ‘‘stop’’ sign (Fig. 1) at the intersection.
This sign was then replaced with a 4-foot ‘‘stop’’ sign
(Fig. 2) and another 100 observations were taken.
In the manner described above 77 drivers who were observed
when the small sign was in position were matched against 77
drivers who were observed when the large sign was in position,
the matching being done on the basis of speeds at the 1100-foot
position. As is shown in Table I the respective mean speeds
of the two matched groups at the 400-foot position were 32.9
mph and 33.4 mph, a difference of 0.5 mph.

³ These data were supplied through the courtesy of Mr. Frank Finney,
Commissioner of Motor Vehicles of Indiana.

Fig. 2. Photograph taken at the same intersection shown in Fig. 1
after the 4-foot stop sign had been installed.

Table I further shows similar comparisons made with the
following secondary groups: those who reside within 25 miles
of the problem location, those who reside more than 25 miles
from the problem location, and those whose addresses indicate
that they reside in a city. In no instance is there a significant
difference in speed at the near position. Hence, it is apparent
that in this experiment no changes in speed at the position
nearest the sign resulted from the change in the size of the sign.
In presenting these facts it is not the intention of the
investigator to imply that the large ‘‘stop’’ sign is of no more value
than the small one. It is possible, while the large sign does not
cause drivers to slow down sooner, that it may produce sooner
in the driver a ‘‘state of readiness’’ whereby potential
accidents can more nearly be averted. Consideration should also
be given to the very small percentage of drivers who might not
see the small sign in time to stop but who might see the large
one soon enough.


TABLE I

Comparison of Mean Speeds 300 Feet¹ from Two Sizes of Stop Sign for
Groups Matched on Basis of Speed 1000 Feet Away

                                     Mean Speed      Mean Speeds 300 Feet from Sign
Group                           N    at 1000 Feet    Small Stop Sign   Large Stop Sign

All 3. 32.9
Reside within 25 miles : 5 30.6
Reside farther than 25 miles , 5. 35.2
Those with street addresses 33.6 20.7

¹ While measurements were taken 1100 feet and 400 feet from the
intersection they were 1000 feet and 300 feet from the ‘‘stop’’ signs.


Group Comparisons. Since the speed patterns of drivers
did not differ in the two sign situations, all data were combined
in order to permit the making of certain group comparisons.
These comparisons were made between 30 men and 30 women
matched on the basis of their speeds at 1100 feet. Table II
shows that their respective mean speeds at the 400-foot position
were 33.2 mph and 33.4 mph, a difference of 0.2 mph.
Table II also shows the comparisons that were made between
those with rural route addresses and those with street
addresses, those who lived within 25 miles of the location and
those who reside more than 25 miles from there, and those
drivers who were alone with those who were accompanied by
one or more persons. There were no significant differences in
the mean approach speeds of any of the groups examined.


TABLE II

Comparison of the Mean Speeds at 300 Feet of Various Groups
Matched on the Basis of Speed at 1000 Feet

                               Mean Speed      Mean Approach
Group                     N    at 1000 Feet    Speed

Men                       30                   33.2
Women                     30                   33.4
Rural route address
Street address
Within 25 miles
More than 25 miles
One occupant
More than one


Sign Position. The investigation at the second location
discussed in this paper employed the same general techniques as
did the first. However, while the study was made at an
intersection, the state road which carried the traffic that was being
studied made a right turn at the intersection. Approximately
300 feet back of the standard 24-inch ‘‘stop’’ sign which was
located at the intersection was a standard arrow type ‘‘turn’’
sign (Fig. 3). Approximately 100 records were made with
this arrangement. Later the same ‘‘turn’’ sign (Fig. 4) was
moved 100 feet farther away from the intersection in such
fashion that the motorist would pass it sooner. Approximately
100 more records were made at identically the same points on
the highway as before.

From the data obtained in this fashion 86 drivers observed
with the sign in the near position were matched with 86 drivers
observed when the sign was in the far position, the matching
being done on the basis of speed at the 1100-foot position. The
mean speeds of the two groups at the 400-foot position are
shown in Table III to be 37.0 mph and 36.2 mph, respectively,
a difference of 0.8 mph. Similarly, as is also indicated in
Table III, secondary comparisons were made with the following
groups: those who reside within 25 miles of the location,
those who reside more than 25 miles from the location, and
those having street addresses. Lindquist’s⁴ technique for testing
the significance of a difference between matched groups
was employed and only the group composed of drivers having
street addresses showed a significant difference between their
mean speeds at the 400-foot location when the position of the
sign was changed. With this group, when the sign was closest
to the intersection the mean was 38.8 mph at the 400-foot
point, and when it was moved back the mean was 35.1; this
difference of 3.7 yields a critical ratio of 3.52 by the method
cited above. In other words, it appears that when the sign was
moved back (away from the intersection) those drivers comprising
this particular group, since they were going more
slowly when they reached the 400-foot mark, began to slow
down sooner. The hypothesis advanced here is similar to that
suggested in another paper,⁵ namely, that perhaps the drivers
making up this group are business and commercial people who
drive a great deal and who have developed habits of responding
to signs in a more positive and less random fashion than the
average driver.

⁴ E. F. Lindquist, ‘‘The Significance of a Difference between Matched
Groups,’’ Journal of Educational Psychology, XXII (March, 1931),
192-204.

Fig. 3. Photograph taken at the second location showing the ‘‘turn’’
sign which was approximately 300 feet from the stop sign.

Fig. 4. Photograph taken later at the second location showing the
‘‘turn’’ sign after it had been moved back to a position approximately
400 feet from the ‘‘stop’’ sign.


TABLE III

Comparison of Mean Speeds at 300 Feet with ‘‘Turn’’ Sign in Two Positions
for Groups Matched on Basis of Speed at 1000 Feet

                                     Mean Speed      Mean Speed at 300 Feet
Group                           N    at 1000 Feet    Near Sign    Far Sign

All ) $5.5 37.0
Reside within 25 miles
Reside farther than 25 miles
Those with street address > $6.1

Group Comparisons. Since statistically significant differences
in conjunction with change in sign position were found
with only 27 pairs of drivers, all of the data were combined for
the purpose of making group comparisons as was done with the
‘‘stop’’ sign data. As is presented in Table IV, 43 men were
matched with 43 women on the basis of their speeds at the
1100-foot position and their respective means at the 400-foot
position were found to be 36.5 mph and 34.8 mph, a difference
of 1.7 mph which is not statistically significant by Lindquist’s
method. Similar examination indicated that those who live
more than 25 miles away slowed down less than those who
reside within 25 miles of the location, that lone drivers slowed
down less than those who were accompanied by other occupants,
and that persons with street addresses slowed down less
than those with rural addresses. Only in the last instance,
however, was the difference statistically significant. Here the
difference of 3.2 mph yielded a critical ratio of 2.34. When
expressed in terms of probability it can be said that there are
approximately 99 chances in 100 that this difference is a real
one and could not likely have arisen through chance.

⁵ C. H. Lawshe, ‘‘The Relationship of Certain Factors to Speed on the
Open Highway,’’ Journal of Applied Psychology, XXIV, No. 3.


TABLE IV

Comparison of the Approach Speeds of Groups Matched on the Basis of
Open Highway Speeds

                               Mean Open        Mean Approach
Group                     N    Highway Speed    Speed

Men                       43   43.3             36.5
Women                     43                    34.8
Rural route address                             34.3
Street address                                  37.5
Within 25 miles
More than 25 miles
One occupant                                    38.7
More than one                                   37.1

That is, two matched groups of drivers were traveling at
the same rate of speed 1100 feet from the intersection; those
with rural route addresses had reduced their mean speed to
34.3 mph by the time they had reached the 400-foot mark while
those with street addresses had only slowed down to a mean
speed of 37.5 mph.

The hypothesis is suggested that those persons who have
rural route addresses are local residents who are familiar with
the intersection and who take their cues for slowing down from
the entire configuration as they begin to approach the corner.
On the other hand, members of this urban group are less
familiar with the intersection and depend more explicitly upon
the sign. That they are probably more responsive to the position
of the sign than any other group has already been suggested
above.

SUMMARY AND CONCLUSIONS


The speeds of approximately 400 drivers were recorded at 
points 1100 feet and 400 feet from one of two highway inter- 
sections. At one location first a standard 24-inch ‘‘stop’’ sign
was used and then a 4-foot ‘‘stop’’ sign. At the second loca-
tion a standard arrow type ‘‘turn’’ sign was located about 300 
feet from the intersection and then was moved to a position 
approximately 400 feet from the intersection. The speeds of 
about 100 drivers were recorded in the presence of each of the 
four sign arrangements. 

Driver response to these variations in sign size and sign loca- 
tion was studied by matching groups on the basis of driver 
speed at the 1100-foot position and by comparing the mean 
speeds of the drivers at the 400-foot position. Comparisons 
between various groups were made in the same fashion. 

The results seem to warrant the following conclusions: 

1. No variations in approach speeds were found with any 
group when the size of the stop sign was changed. 

2. When all data collected at the first location were com- 
bined and comparisons made between groups, no differences in 
approach speeds were found. 

3. It is pointed out that potential values resulting from in- 
creasing the size of a ‘‘stop’’ sign are not necessarily confined 
to the speed aspect. There may be a ‘‘readiness’’ factor which, 
while not apparent in the driver’s speed, is important from the 
safety point of view. 

4. When the ‘‘turn’’ sign was moved farther away from the
intersection, differences in the approach speeds of the matched
groups were found only in the case of those drivers who have
street addresses. Here it was found that when the sign was
farther away the mean speed was slower at the 400-foot point
than it was when the sign occupied the nearer position.
Hence, it appears that those with street addresses were more
sensitive to the location of the sign than any other group.


5. When all of the data collected at the second location were
combined, comparison of mean approach speeds of matched
groups indicated that men slowed down less than women, that
those residing farther away slowed down less than near residents,
that lone drivers slowed down less than those who had companions,
and that those who have street addresses slowed down less than
those who have rural addresses. Only in the case of this last
comparison, however, was the difference statistically
significant.

6. The hypothesis is advanced that the rural drivers, being 
mostly local residents, are more familiar with the location 
which was studied and that they tend to respond to the entire 
configuration rather than to the specific ‘‘turn’’ sign as do 
those with urban addresses who are less familiar with the inter- 
section and who are habituated to depending upon signs. 








III. SOME DRIVER OPINIONS AND THEIR
RELATIONSHIP TO SPEED ON THE
OPEN HIGHWAY


C. H. LAWSHE, JR. 


Purdue University 


The question of the functional operation of attitudes as
expressed through various scales and measuring devices
is one that is frequently raised but rarely examined on
an experimental basis. The present investigation permitted
such an examination and has as its purpose the comparison of
the opinions or expressed attitudes of automobile drivers to
the actual behavior of these same drivers on the open highway.


EXPERIMENTAL PROCEDURE 


Collection of data. By means of the highway speed recorder
described by Lawshe¹ and according to the method discussed
in another paper,² data including speed on the open highway,
sex, age, and place of residence of owners, together with the
number of occupants in each car, were obtained for 297
residents of the state of Indiana.

¹ C. H. Lawshe, Jr., ‘‘Two Devices for Measuring Driving Speed on
the Highway,’’ American Journal of Psychology, (July, 1940).

² C. H. Lawshe, Jr., ‘‘The Relationship of Certain Factors to Speed on
the Open Highway,’’ Journal of Applied Psychology, XXIV, No. 3, 194.

The Questionnaire. A brief questionnaire was designed
and printed on double post cards. The portion addressed to
the car owner explained that Purdue University was interested
in finding out what drivers think about certain problems
related to highway safety and a plea was made for the
cooperation of the recipient. Car owners were asked not to
sign their names; however, return addresses were coded in
such a manner that, while it was impossible for the driver to
suspect it, returns could be matched with the data already
available. Three questions were asked. The first was:

What is the fastest speed at which a motorist may 
travel with safety on the open highway, on a good 
concrete or blacktop road, in the daylight and in dry 
weather? 


The second was as follows:

    Check below one cause which you believe is responsible for
    most highway accidents:
    ____ Driving too fast
    ____ Failing to signal to other drivers
    ____ Driving on wrong side of road
    ____ Passing when unsafe
    ____ Disregarding signs and signals
    ____ Other causes

The first five of the accident causes just listed were adapted 
from a list of ‘‘improper acts’’ cited by Williams.³ The third
question pertained to the age of the recipient and returns were 
utilized to augment the incomplete age data which had been 
compiled from another source. Questionnaires were mailed 
only to those owners whose sex corresponded to the sex of the 
drivers observed, thus eliminating a portion of the cases in 
which owners and drivers were not the same persons. 


THE RESPONDING SAMPLE 


Respondents and Non-Respondents. Of the 297 to whom
cards were mailed, 107 or 36 per cent responded. The question
naturally arises as to whether those who answered differ
materially from those who did not answer in so far as the data
available are concerned. Examination of the observed speeds
of both groups indicated that the mean speed of the responding
group was 44.6 mph ± .73 as compared to 43.0 mph ± .65
for the non-respondents, a difference which is not statistically
significant.

³ S. J. Williams, ‘‘Accidents on the Road,’’ Public Roads, XIX (1938),
i-83.

Further examination (Table I) revealed that a greater
percentage of men responded than women, a greater percentage
of lone drivers answered as compared to those who were
accompanied, more of those living over 25 miles away sent returns
as compared to those who reside within a 25-mile radius of the
point of observation, and a greater percentage of those having
street addresses as compared to those having rural route
addresses responded by mailing their return cards. However,
none of these differences is statistically significant so it appears
that for the purposes of this study it may be assumed that the
respondents constitute an unselected sample from the population
which was circularized.


TABLE I

Percentages of Various Groups that Responded to Questionnaire

                       Number         Number        %
Group                  Circularized   Responded     Responded    S.D.%

All drivers            297            107           36.0         2.8

Driver’s sex
  Male                 272            101           37.1         2.9
  Female                25              6           24.0         8.5

Number occupants
  One                  111             47           42.3         4.7
  Two or more          186             60           32.3         3.4

Residence
  Within 25 mi.        181             61           33.7         3.5
  Over 25 mi.          116             46           39.6         4.5
  Rural Route           69             20           29.0         5.5
  Street Address       168             68           40.5         3.8
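The percentages and their standard deviations in Table I can be checked with a short calculation. The paper does not state its formula; the sketch below assumes the usual binomial standard error of a proportion, sqrt(pq/N), which reproduces the tabled values, and assumes that two percentages are compared by dividing their difference by the root of the summed squared errors. The function names are illustrative only.

```python
import math


def pct_and_sd(responded, circularized):
    """Return (per cent responding, S.D. of that per cent).

    Assumes the binomial standard error sqrt(p * q / N).
    """
    p = responded / circularized
    sd = math.sqrt(p * (1 - p) / circularized)
    return 100 * p, 100 * sd


def critical_ratio_pct(resp_a, n_a, resp_b, n_b):
    """Critical ratio for the difference between two independent percentages."""
    pct_a, sd_a = pct_and_sd(resp_a, n_a)
    pct_b, sd_b = pct_and_sd(resp_b, n_b)
    return (pct_a - pct_b) / math.sqrt(sd_a ** 2 + sd_b ** 2)


all_pct, all_sd = pct_and_sd(107, 297)          # "All drivers" row
sex_cr = critical_ratio_pct(101, 272, 6, 25)    # men versus women
```

Under these assumptions the men-women comparison gives a critical ratio of roughly 1.5, well short of the values the paper treats as significant, which agrees with the statement that none of the differences in response rate is reliable.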


OPINION OF MAXIMUM SAFE SPEEDS


Opinion and Behavior. Of the 107 respondents 104 gave an
expression of opinion of maximum safe speed in response to
the first question already described. It will be noted that the
question called, not for the person’s usual driving speed, but
for his opinion of ‘‘maximum safe speed.’’ The coefficient of
correlation between opinion of maximum safe speed and actual
speed observed on the open highway was found to be .46 ± .08.
This value yields a coefficient of determination of .21; in other
words, if a causal relationship were assumed to exist it could
be said that 21 per cent of the variation in driving speed could
be attributed to variation in attitude.
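The arithmetic behind these two figures is brief. The coefficient of determination is simply the square of r. The paper does not say whether the ± .08 is a standard error or a probable error; the sketch below assumes the large-sample standard error (1 - r²)/sqrt(N), which happens to reproduce the reported value for N = 104.

```python
import math


def coefficient_of_determination(r):
    """Proportion of variance in one variable associated with the other."""
    return r ** 2


def standard_error_of_r(r, n):
    """Large-sample standard error of a correlation coefficient (assumed formula)."""
    return (1 - r ** 2) / math.sqrt(n)


r_squared = coefficient_of_determination(0.46)
se_r = standard_error_of_r(0.46, 104)
```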

Do drivers exceed their own maxima? The data indicate 
that 6 of the 104 at the time and point of observation exceeded 
their own speed limits and that only 3 exceeded their limits by 
more than 3 mph. 

Character of the Distribution. Since a large portion of the
respondents gave either 50 mph or 60 mph as the maximum
speed for safety, the distribution of opinions was distinctly
bi-modal. Because of this fact all who specified speeds between
those two limits were discarded and the remaining two groups,
those who specified 50 mph or less and those who specified 60
mph or more, were compared in order to determine what
differences, if any, obtain.

Even though bi-modality of the distribution was quite ap- 
parent, Pearson’s test of goodness of fit was applied through 
the use of chi-square and it was determined that the deviation 
from the normal probability curve was so great that it could 
not have arisen by chance. The question naturally arose as to 
what effect, if any, this deviation from the normal curve would 
have upon the coefficient of correlation, .46, expressed above. 
Hence the correlation ratio or eta was computed from the 
means of the arrays and found to be .54. This value of the 
‘‘raw’’ eta when corrected for errors introduced by the proc- 
ess of grouping data yielded .45 which is virtually the same as 
the r value of .46 originally presented. Since eta represents 
the maximum correlation that can be obtained, it is apparent 
that the bi-modal nature of the distribution of opinions has not 
resulted in a spurious coefficient of correlation. 
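For readers unfamiliar with the correlation ratio, the sketch below shows how a ‘‘raw’’ eta is computed from the means of the arrays: it is the square root of the between-array sum of squares divided by the total sum of squares. The data are hypothetical, and the correction for grouping that the paper applies is not reproduced here because its exact form is not given.

```python
import math


def correlation_ratio(arrays):
    """Raw (uncorrected) correlation ratio eta.

    `arrays` holds one list of observed speeds per class of stated
    maximum safe speed.
    """
    values = [y for array in arrays for y in array]
    grand_mean = sum(values) / len(values)
    ss_total = sum((y - grand_mean) ** 2 for y in values)
    ss_between = sum(
        len(array) * (sum(array) / len(array) - grand_mean) ** 2
        for array in arrays
    )
    return math.sqrt(ss_between / ss_total)


# Hypothetical observed speeds grouped by stated maximum (e.g. 50, 55, 60 mph).
example_eta = correlation_ratio([[40, 42], [44, 46], [50, 52]])
no_relation = correlation_ratio([[1, 3], [1, 3]])
```

When every array has the same mean the ratio is zero, and it approaches one as the array means account for nearly all of the variation.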

Characteristics of Groups. Means of the observed speeds of
the extreme groups were found to be 47.6 mph ± 1.03 for the
group which considers 60 mph or faster as a safe driving speed
and 40.8 mph ± .98 for the remaining drivers. This difference
of 6.8 mph is statistically significant with a critical ratio
of 4.79.
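The critical ratio quoted here can be approximated from the two means and their errors. The sketch assumes the groups are independent, so that the standard error of the difference is the root of the sum of the squared errors; the paper does not spell out its computation, and the small gap between the result and the printed 4.79 is presumably due to rounding of the published figures.

```python
import math


def critical_ratio(mean_a, se_a, mean_b, se_b):
    """Difference between two independent means divided by its standard error."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Fast-opinion group versus the remaining drivers.
opinion_cr = critical_ratio(47.6, 1.03, 40.8, 0.98)

# Respondents versus non-respondents, from the earlier section.
response_cr = critical_ratio(44.6, 0.73, 43.0, 0.65)
```

The same rule applied to the respondents and non-respondents gives a ratio of about 1.6, in line with the earlier statement that their speeds did not differ significantly.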

Other comparisons are made in Table II. It will be noted 
that while there is a greater percentage of men than women in 
the 60 mph or faster group, the difference is not significant. 


TABLE II

Comparison of Percentages of Various Groups Who Gave 60 MPH or More
as Maximum Safe Speed

                           Number⁴
Group                      Total   60 or over    %      SD %    Diff.   SD Diff.   C.R.

SEX
  Male                      86       47         54.6     5.4
                                                                 21.3     19.9     1.07
  Female                     6        2         33.3    19.2

RESIDENCE
  Within 25 mi.             55       30         54.5     6.7
                                                                  3.2     10.6     0.30
  Over 25 mi.               37       19         51.3     8.2

  Rural route               18        3         16.7     8.8
                                                                 45.3     10.9     4.16
  Street address            58       36         62.0     6.4

OPINION OF CAUSE
  ‘‘Driving too fast’’      19        5         26.3    10.1
                                                                 31.5     11.8     2.67
  All others                64       37         57.8     6.2

  ‘‘Passing when unsafe’’   30       20         66.6     8.6
                                                                 25.1     11.0     2.28
  All others                53       22         41.5     6.8

⁴ Number does not include those who expressed a maximum safe speed
between 50 and 60 mph.


Neither is there a significant difference between the percentage
of persons residing within a 25-mile radius in the fast group as
compared to the percentage of persons residing more than 25
miles from the point of observation. However, while 62 per
cent of the persons with street addresses gave 60 mph or more
as a maximum speed for safety, only 16.7 per cent of the
persons with rural route addresses are included in the group.
The difference between these percentages is significant, the
critical ratio being 4.16.

The mean age of the group expressing the faster maximum
was found to be 45 years ± 2 as compared to 42 years ± 2 for
the other group. In comparing the car ages of the two groups,
it was found that the median age of car of the faster group is
1 year as compared to a median age of 4 years for the other.


OPINION OF ACCIDENT CAUSATION 


Distribution of Responses. The possible responses on the
second question previously listed with the exception of ‘‘Other
causes’’ were rotated in printing the cards to minimize any
effect that position of the item might have upon its selection
by the respondent; in other words, ‘‘Driving too fast’’
appeared in first place in a fifth of the questionnaires, in second
place on a fifth, etc. Each item appeared in each position an
equal number of times with the exception of ‘‘Other causes’’
which was always listed last. The 107 respondents checked
causes with the following frequencies: ‘‘Passing when unsafe,’’
37; ‘‘Disregarding signs and signals,’’ 27; ‘‘Driving too
fast,’’ 21; ‘‘Failing to signal to other drivers,’’ 8; ‘‘Other
causes,’’ 4; ‘‘Driving on wrong side of road,’’ 0; no answer,
more than one answer, or otherwise invalid, 10.

Those who checked ‘‘Driving too fast’’ were found to have
a mean observed speed of 40.6 mph ± 1.71 while all others had
a mean of 45.6 mph ± .76. This difference of 5 mph yields a
critical ratio of 2.67 which is to say that there are 996 chances
in 1000 that the true difference is greater than zero and that
the obtained difference could not have arisen by chance. As is
shown in Table II there was a tendency (996 chances in 1000
that the difference is greater than zero) for those who checked
a cause other than ‘‘Driving too fast’’ to indicate a safe
maximum speed of 60 mph or more. Likewise, there was a tendency
(989 chances in 1000 that the difference is greater than zero)
for those who marked ‘‘Passing when unsafe’’ to indicate 60
mph or more as maximum safe speed.
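The ‘‘chances in 1000’’ figures follow from reading a critical ratio as a normal deviate and taking the area of the normal curve below it. The sketch assumes that one-tailed reading, which is not stated in the paper but reproduces both quoted figures; the function name is illustrative.

```python
import math


def chances_in_1000(critical_ratio):
    """Chances in 1000 that the true difference is greater than zero.

    Uses the cumulative normal distribution evaluated at the critical ratio.
    """
    area = 0.5 * (1 + math.erf(critical_ratio / math.sqrt(2)))
    return round(1000 * area)


too_fast = chances_in_1000(2.67)
passing = chances_in_1000(2.28)
```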
SUMMARY AND CONCLUSIONS


Questionnaires were mailed to 297 automobile owners whose
car speeds on the highway had been previously recorded and
about whom various types of information had been gathered.
The subjects were requested to indicate what they believed
to be a maximum safe speed under favorable conditions and to
check what they believed to be the major cause of highway
accidents. Replies were received from 107 or 36 per cent of
the recipients.

The results warrant the following conclusions: 

1. The responding sample was representative of the group
circularized in so far as observed speed, sex distribution, and
distribution by place of residence and rural-urban residence
are concerned.

2. There is a significant relationship between the maximum
safe speed as expressed on the questionnaire and the speed
observed on the highway, the coefficient of correlation being
.46 ± .08. In general, those who expressed higher speed maxima
drove faster, but relatively few drivers (6 out of 104)
exceeded their own maxima.

3. There was a distinct tendency for persons with street 
addresses to express 60 mph or more as a safe maximum and 
for persons with rural route addresses to express 50 mph or 
less as a safe maximum. 

4. In general, persons expressing higher speed maxima
drove newer cars.

5. The persons who responded believe that ‘‘ Passing when 
unsafe’’ is responsible for most highway accidents. 

6. Those who checked ‘‘Driving too fast’’ drove more slowly
than those who checked other causes.








A TEST-INTERVIEW FOR DELINQUENT
CHILDREN¹


RALPH M. STOGDILL 


Bureau of Juvenile Research, Columbus, Ohio 


Various types of tests have been devised for measuring
delinquent tendencies. Among these are tests of cheating,
exaggeration and dishonesty. There are also available
numerous scales and questionnaires which have been found
to discriminate well between the attitudes, interests, and
emotional and personality traits of delinquent and non-delinquent
children. Since the most widely used of these devices have
been reviewed by Symonds (4) it will not be necessary to
describe them here. However, it should be mentioned that one
factor which seems to be common to most of these tests is the
indirect approach to the child. That is, the central purpose of
the test is disguised so that the subject is not made aware of the
fact that he is being tested for delinquent trends. This
seemingly distrustful attitude toward children may be of some value
in gathering research data, but does not reveal anything about
the child’s actual delinquencies or aid the child in gaining
insight.

At the Bureau of Juvenile Research we attempt to make a
straightforward approach to the child’s difficulties. In accord
with this principle we have developed a test-interview in which
the child is questioned directly about his behavior. We have
been interested, not so much in devising a research instrument
for measuring delinquent tendencies, as in preparing a
standardized interview which would enable the child to reveal as
much of his delinquent past as he might be willing to admit.
The results which we have obtained indicate that we were not
over-optimistic in believing that children would cooperate
whole-heartedly when asked to take such a test.

¹ This investigation was conducted with the assistance of Miss Ruth
Pushin, Psychological Interne at the Bureau of Juvenile Research.

In developing the test-interview, which we have named the
‘‘Behavior Cards,’’ we have attempted to keep several practical
objectives in view. It was decided that the test should be
easy, uninvolved and mechanically simple for the benefit of the
subjects, and extensive enough to cover a comprehensive
sampling of delinquent behavior and experiences. The test items
should be stated in terms which would be readily comprehended
by, but would be inoffensive to, children. Most important,
the general form, method and character of the test should
be of a sort which would tend to reduce rather than increase the
child’s emotional tensions.

As has been mentioned, the test is mechanically simple and
uninvolved. Following a scheme employed by Maller (2), each
item is printed on a separate card. The card is dropped by the
subject into one of two boxes (a ‘‘yes’’ box and a ‘‘no’’ box) to
indicate his response. After the complete set has been sorted
into the two boxes, the ‘‘yes’’ responses are checked on a
tabulating sheet and the cards are resorted and placed in numerical
order in the pack. Thus the subject’s responses are in a fashion
obliterated, much as if he had made a drawing in the sand
which could be readily erased. It seems evident that this
scheme possesses an important advantage over the paper-pencil
test in which the child feels that he is writing a permanent
record of his shortcomings. This factor may be, in part,
responsible for the high validity coefficients obtained with the
Behavior Cards.

The Behavior Cards, in the complete form, consisted of 189 
items. However, only 150 of these items were given to both 
groups studied in this investigation. In testing, the subject 
was seated at a table alone and was given instructions for 
sorting the cards into the two boxes. 

The experimental group consisted of 100 delinquent boys
studied at the Bureau of Juvenile Research. Approximately
75 per cent of them were committed to the Bureau by juvenile
courts, while the remaining 25 per cent were committed by
parents, social agencies, children’s homes and state institutions.
Their average Stanford-Binet intelligence quotient was 95.6.
There was only one feebleminded boy in the group. There
were perhaps a few more cases of sex delinquency than would
be found among unselected delinquent boys.

The control group, which was studied by Miss Pushin (3), 
consisted of fifty boys in the seventh and eighth grades of a 
junior high school, located in an urban community which is 
regarded as slightly below average in socio-economic status. 
The average Barr rating of father’s occupation was 7.55 which 
approximates that of railroad firemen. These children were 
tested individually at the school building. The same testing 
procedure was employed as for the delinquents, except that the 
items referring to sex offenses were removed when adminis- 
tering the Behavior Cards to the non-delinquent boys. 

The two groups were closely matched as to chronological age,
but not as to grade placement or intelligence test scores. The
average chronological age of the delinquents was 14.6 years,
while that of the school boys was 14.4 years. The average men-
tal age score on the Ohio Literacy Test was 12.6 for the delin-
quents and 13.5 for the school boys. The average grade place-
ment for the delinquent boys was 7.0, and that for the normal
boys 8.5.

When we compare the two groups of boys according to their
total scores on the Behavior Cards we find that they are sharply
differentiated. The average score for the delinquent boys is
41.6 (with a standard deviation of 17.1). The average for the
public school boys is 24.8 (with a standard deviation of 15.4).
The critical ratio of this difference is 6.1. This suggests that a
well established difference is to be found between these two
groups when they are compared as to the delinquent activities
which they are willing to admit.
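
The critical ratio quoted above can be checked from the reported means, standard deviations and group sizes. The following is a minimal sketch, assuming that the critical ratio is the difference between the two means divided by the standard error of that difference for independent groups; the function name is ours and does not come from the original study.

```python
import math


def critical_ratio_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by the standard error of the difference."""
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Reported values: 100 delinquent boys, 50 public school boys.
cr = critical_ratio_means(41.6, 17.1, 100, 24.8, 15.4, 50)
print(round(cr, 1))  # 6.1
```

Under this assumption the computed value agrees with the reported critical ratio of 6.1.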

When the 100 delinquent boys are classified as to the major
offense for which they were committed to the Bureau, we ob-
serve some interesting differences in their scores on the Be-
havior Cards. Those boys who were committed for stealing,
truancies and homosexual offenses make scores somewhat above
the average for the group as a whole. Those who were charged
with heterosexual offenses make scores somewhat below the
group average, and those boys who had committed assaults
made scores twenty points below the average of the total group.
We believe that these scores represent valid differences in per-
sonality adjustment and delinquent trend.

When the delinquent boys were classified as to whether they
were ‘‘Stable’’ or ‘‘Unstable’’ in personality development, the
following results were obtained: The average score of the stable
boys was 36.9, while that of the unstable boys was 45.2. Boys
who were regarded by the psychologist as having a Good prog-
nosis for satisfactory future adjustment made average scores of
34.1, while those with a Poor prognosis had average scores of
49.4. That is, the unstable boys and those with poor prognoses
placed a significantly greater number of cards in the ‘‘yes’’
box. When the delinquent boys were classified as to intelli-
gence and home background, no marked differences were found.
The correlation between Stanford-Binet IQ and score on the
Behavior Cards was -.02.

Thus far we have considered only those differences which 
exist when groups are compared as to total scores. Still more 
striking differences are obtained when the 100 delinquent boys 
are compared with the fifty normal boys as to their responses 
to the individual items of the Behavior Cards. Only two per 
cent of the normal boys admit having been in Juvenile Court, 
while 77 per cent of the delinquents say they have been. (The 
critical ratio of this difference is 15.6.) All items relating to 
stealing distinguish significantly between the two groups. Ten
per cent of the normal boys admit frequent thefts, while 53 per 
cent of the delinquent boys admit having stolen often. (The 
critical ratio is 6.5.) Two per cent of the normal boys, as com- 
pared with 59 per cent of the delinquents, admit stealing 
money. (The critical ratio is 10.8.) 
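
The item comparisons above can be approximated in the same manner. The sketch below assumes an unpooled standard error for the difference between two independent percentages, with 100 delinquent and 50 normal boys; the original does not state which formula was used, and the function name is ours. Because the published percentages are rounded, the computed ratios come close to the reported ones without matching them in every case.

```python
import math


def critical_ratio_percents(pct_a, n_a, pct_b, n_b):
    """Critical ratio for the difference between two independent percentages."""
    p_a, p_b = pct_a / 100, pct_b / 100
    se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return (p_a - p_b) / se_diff


# (per cent of delinquents, per cent of normal boys, critical ratio as reported)
items = {
    "been in Juvenile Court": (77, 2, 15.6),
    "stolen often": (53, 10, 6.5),
    "stolen money": (59, 2, 10.8),
}

for label, (pct_delinquent, pct_normal, reported) in items.items():
    cr = critical_ratio_percents(pct_delinquent, 100, pct_normal, 50)
    print(f"{label}: computed {cr:.1f}, reported {reported}")
```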

All the items regarding truancy from school and the major-
ity of items referring to running away from home distinguish
significantly between the two groups. Eight per cent of the
normal boys, as compared with 51 per cent of the delinquents,
admit having run away from home. (The critical ratio is
7.0.)

We do not know how many of the public school boys might 
have answered ‘‘yes’’ to the questions regarding sex experi- 
ences, since those cards were omitted when the test was given to 
them. Fourteen per cent of the delinquent boys admit homo- 
sexual experiences, while 20 per cent of them admit experiences 
with the opposite sex. Although 21 per cent of these boys 
admit fear of insanity or loss of health (which is usually asso- 
ciated with guilt feelings regarding masturbation), only two 
per cent of these boys admit excessive masturbation. 

Items referring to bad companions discriminate well between 
the two groups. Twelve per cent of the school boys answer
‘‘yes’’ to the question, ‘‘Do you go around with some boys and
girls who get into lots of trouble?’’ Forty-one per cent of the 
delinquent boys answer ‘‘yes’’ to this question. (The critical 
ratio is 4.4.) 

All items concerning deep seated worries, fears, and compul- 
sions distinguish well between the two groups. Ten per cent 
of the normal boys, as compared with 29 per cent of the delin- 
quents, answer ‘‘yes’’ to the question, ‘‘Is there something
terrible that you worry about?’’ There is evidence here which
suggests that the delinquent child is an unhappy child with
many fears and worries. The correlation coefficient of .76
between the Maller Personality ... and the Behavior
Cards would seem to support this.

Few of the items regarding the child’s relationship to his 
parents discriminate well between the two groups of boys. The 
only items which do discriminate well are those referring to 
beatings and over-severe punishment. Only two per cent of 
the normal boys answer ‘‘yes’’ to the question, ‘‘Do your par- 
ents beat you?’’ Twenty-two per cent of the delinquents 
answer ‘‘yes’’ to this question. (The critical ratio is 4.2.)
Items referring to desertion by parents are fairly discriminat- 
ing. However, those items regarding overprotection by par- 
ents do not differentiate between the two groups. These find- 
ings are in accord with those of Fitz-Simons (1) and other 
investigators who have found that delinquent children are 
likely to come from homes where the parents are rejectful in 
their attitudes. 

We should not fail to mention many of the similarities be- 
tween the normal and delinquent boys. Both groups are 
highly sociable, liking to play with other boys and girls. Both 
groups admit that they have high tempers, fight, swear, disobey 
and become angry easily. Approximately 70 per cent of both 
groups have to work for their spending money. Over 40 per 
cent of both groups admit that they argue with their parents 
when told to do things about the home. These findings would 
seem to indicate that the delinquent boy is not a distinct type
or species. He merely presents, in a more extreme degree, the
same problems which are experienced by normal boys. 

There are a few items on which the normal boys make higher 
scores than the delinquents. These refer mostly to many com- 
panions, fighting and high temper. It seems significant to us 
that 46 per cent of the normal boys, as compared with 36 per
cent of the delinquents, answer ‘‘yes’’ to the question, ‘‘Is there
somebody you would like to beat up on?’’

Internal consistency values were computed for each item, 
and indicate that the boys who make the highest total scores on 
the test are the ones who are likely to answer ‘‘yes’’ to items 
regarding truancies, lying, fighting, stealing, disobedience, sex 
offenses, bad companions, fears and court appearances. 

The reliability coefficients on the Behavior Cards are high
enough to be regarded as satisfactory. The odd-even reliabil-
ity (uncorrected) for the delinquent group is r = .86, while that
for the normal group is r = .88. When corrected for attenua-
tion these become r = .924 for delinquents, and r = .936 for
normals.
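
The corrected coefficients above are consistent with the Spearman-Brown step-up of an odd-even (split-half) correlation to full test length. That identification is our inference from the figures, since the text speaks only of correction for attenuation. A minimal sketch under that assumption:

```python
def spearman_brown(r_half, factor=2.0):
    """Estimated reliability of a test lengthened by `factor`, from a part-test correlation."""
    return factor * r_half / (1 + (factor - 1) * r_half)


print(round(spearman_brown(0.86), 3))  # 0.925 (reported as .924)
print(round(spearman_brown(0.88), 3))  # 0.936
```

The first value differs from the reported .924 only in the third decimal, which would be explained if the original figure was truncated rather than rounded.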

Miss Pushin (2) did all the work involved in validating the
items in the Behavior Cards. She made a careful analysis of
each boy’s social history in order to determine which delin-
quencies were actually mentioned there. The boy’s answer to
each question in the Behavior Cards was then compared with
the information contained in the social history, and the infor-
mation from both sources was plotted on four-fold tables.
(That is, the boy’s answer, ‘‘yes’’ or ‘‘no,’’ to a given question
regarding delinquent behavior was plotted against the mention
or failure to mention such delinquency in the social history.)
The correlation coefficients were then computed with the aid of
Thurstone’s (5) tables. These correlations were computed
only upon the data for the delinquent boys, since we had no
case histories on the public school boys.
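
The four-fold arrangement described above may be illustrated as follows. This is only a sketch: the study read its coefficients from Thurstone's tables, whereas the code substitutes the well-known cosine-pi approximation to the tetrachoric correlation, and the cell counts are hypothetical, since the item-level tables are not given in the text.

```python
import math


def cos_pi_tetrachoric(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r for a four-fold table.

    a: card 'yes', offense mentioned in the social history
    b: card 'yes', offense not mentioned
    c: card 'no',  offense mentioned
    d: card 'no',  offense not mentioned
    """
    if b * c == 0:
        return 1.0  # degenerate table: no disagreements observed
    ratio = math.sqrt((a * d) / (b * c))
    return math.cos(math.pi / (1 + ratio))


# Hypothetical counts for one item among 100 delinquent boys.
r = cos_pi_tetrachoric(a=40, b=10, c=8, d=42)
print(round(r, 2))
```

The coefficient rises as the agreement cells (a and d) grow relative to the disagreement cells (b and c), and it is zero when the four cells are equal.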

The average validity coefficients for items relating to juvenile 
court appearances, truancy from school, swearing and hetero- 
sexual experiences are above .80; while the average coefficients 
for items concerning stealing, lying, bad companions, and mis- 
treatment by parents are above .70. Those for homosexual 
experiences, fighting and assaults are above .60, and those for 
truancy from home are above .50. The average validity coeffi- 
cient of items referring to disobedience is .36, that for setting 
fires is .10, and that for complaining of being picked on by 
other children is -.56. It was not possible to obtain adequate
validating information from the social histories regarding 
abnormal fears and other such subjective factors. 

The magnitude of these correlations and the large number of
boys who admitted various offenses and difficulties not men-
tioned in the social histories lead us to wonder which we have
validated: the boys’ responses or the social histories! At
least, these coefficients indicate that the delinquent boys re-
ported with a high degree of accuracy their offenses relating to
stealing, truancy from home and school, lying, sex experiences,
swearing, fighting, running with bad companions, and commit-
ment to the Juvenile Court. They do not make reliable reports
regarding setting fires or excessive masturbation. Their claims
that other children pick on them are negatively correlated with
the facts as determined by the social history. This is in accord
with our observation that children who are most frequently
reported by the cottage supervisors for quarreling and annoy-
ing other children are the ones who complain that others pick
on them.

In summary we may say that the Behavior Cards constitute
a standardized interviewing technique which discriminates well
between normal and delinquent boys when they are compared
as to total score, as well as to their responses to certain indi-
vidual items. The reliability coefficients are fairly high. The
internal consistency values of the more discriminating items
are high enough to be regarded as satisfactory, while the valid-
ity coefficients, if we may call them such, are unusually high.
We believe that the test might have some value for the diagnosis
of delinquent tendencies. However, we regard this use of the
test as of minor importance. It was our primary purpose in
constructing the Behavior Cards to devise a ‘‘low pressure’’
type of test-interview which would enable the delinquent child
to face his problems with a minimum feeling of compulsion and
external pressure.

Very few children resent being asked to sort these cards. 
Children often refer to the cards in subsequent interviews, say- 
ing, ‘‘Do you remember those cards I sorted out for you? I
answered some questions there that I never told anybody about
before.’’ Many children place cards referring to stealing, sex
offenses, and abnormal fears and worries in the ‘‘yes’’ box 
when there was no mention made of these factors in their social 
histories. 

We do not mean to suggest that the Behavior Cards can be
used as a substitute for intelligent clinical interviewing. We 
do believe, however, that they can be used as a valuable aid in 
interviewing. 

CONSUMER AND OPINION RESEARCH:
EXPERIMENTAL STUDIES ON THE
FORM OF THE QUESTION
SYDNEY ROSLOW, WALLACE H. WULFECK 
AND PHILIP G. CORBY 


The Psychological Corporation 


INTRODUCTION 


Although considerable emphasis has been placed on
problems of sampling in questionnaire research, in re-
cent times the questions themselves have been subjected
to critical scrutiny. Workers in the field of market research
have been aware of the need for experimental trials of different
forms of the questions before a study is finally placed in the
field.² A recent publication suggests nine principles to be
observed in phrasing questions to be used in questionnaires.³

It is the purpose of this paper to describe several experi- 
mental findings obtained from studies in which alternate forms 
of questions were introduced. By the form of the question is 
meant the actual word content or serial order of the words in 
the sentence. Ordinarily the form of the question is altered 
without intending to shift the conceptual reference or meaning 
thereof. Form may vary in such respects as: Order of alter- 
natives, degree of alternatives, completeness of alternatives, use 
of stereotyped words, appeal to prestige or authority, per- 
sonal or impersonal form, positive or negative statement, check 
list, opportunities for free response, etc. Any given question 


1 Roper, E., Three weaknesses of market research. Market Research, 
1938, 8, No. 6, 16-19. 

2 Coutant, F. R., Supervising the field investigation. Market Research, 
1938, 8, No. 6, 19-21. 

3 Roslow, S., and Blankenship, A. B., Phrasing the question in con- 
sumer research, J. Appl. Psychol., 1939, 23, 612-622. 
may thus be put in one of several forms, the meaning of the
question remaining apparently the same. Yet only experi- 
mental trial can indicate whether the responses have been 
influenced by the particular wording used. 

As an illustration of a non-experimental situation in which 
the results may have been a function of the forms of the ques- 
tions used, as well as a function of changed attitude in the 
respondent, consider the following questions reported in a 
popular monthly survey of social issues. 

‘‘On the whole, do you approve or disapprove of Roose-
velt’s international policy?’’
    When asked in July 1938: 50. % approve
    When asked in August 1939: 48.5% approve

‘‘Do you approve or disapprove of Roosevelt’s policies
with regard to the European situation up to now?’’
    When asked in September 1939: 69.2% approve

The survey in presenting these widely different results 
cautioned: ‘‘The second question was so framed as to sharpen
the focus upon the immediate international situation and to 
reduce the number of persons likely to answer that they had 
no opinion to the previous question. Obviously, however, the 
meaning of the questions are so similar that the replies may, 
with slight reservations, be set side by side.’’ In the absence 
of experimental controls, one may question how slight these 
reservations may be. It is entirely possible that part of the 
change in results may have been caused by the change in phras- 
ing the later question. Notice that the meaning is shifted by 
limiting the extent of the policy’s application to Europe alone 
rather than the whole world, thus eliminating, from the 
apparent meaning, the Sino-Japanese situation. 

It has been mentioned that different forms of questions are 
accomplished by alterations in the wording. Looking only at 
the words, such alterations may appear to vary from slight 
or minor changes to major rearrangement in form. Consider 
these following pairs of questions: 


a) Is the service at Blank’s reasonably good? (or) Is the
   service at Blank’s all you could expect?

b) Should ‘‘reds’’ and ‘‘radicals’’ in the country be de-
   ported? (or) Should members of the American Com-
   munist Party be deported?

c) Should our government allow American ships to carry
   goods anywhere or should our ships be kept out of war
   zones? (or with the clauses reversed) Should our gov-
   ernment keep American ships out of war zones or allow
   our ships to carry goods anywhere?

d) Do you believe the Communist Party in the U. S. should
   be abolished by law? (or) Do you believe there should
   be a law to prevent this country from becoming com-
   munistic?

The first three alternatives illustrate relatively simple changes 
in serial word order or actual word change. However, the 
change is so great in the last instance that we have actually 
two different questions. 

It is not the purpose of this presentation to contrast the 
effect of small and large changes in wording. No one can tell
in advance whether the extent of word change is proportion- 
ately related to change in meaning and will yield small or large 
changes in results. 

METHOD 


This paper includes the results of eight studies in which 
varying forms of questions were used. These studies were 
conducted by the Psychological Corporation for practical 
research purposes in advertising and marketing problems.
They were adaptable to a continuing research program for 
experimental evaluation of some of the effects of wording in a 
question upon the response. This adaptation of practical 
situations to experiment permitted adequately large samples 
to indicate small changes in response where smaller samples 
might have been inconclusive. 

(A) There were no preconceived notions regarding the
meaning content of the alternate question forms, particularly
to the persons to be questioned. (B) These experimental
forms were always made to conform to the general context of 
the interview in which they were used. 

Four distinct experimental methods were employed:

Method I. In the same study two separate questionnaires
were used including alternate forms of the experimental ques- 
tions and identical forms of the control questions. In inter- 
viewing, the field worker used an alternate form for every 
other person interviewed. 
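
The alternation described under Method I can be sketched as follows. This is an illustration only; the function name and the respondent labels are hypothetical and are not taken from the field procedure.

```python
from itertools import cycle


def assign_forms(respondents, forms=("A", "B")):
    """Give successive respondents the questionnaire forms in strict rotation."""
    return [(person, form) for person, form in zip(respondents, cycle(forms))]


# Hypothetical sequence of six interviews by one field worker.
for person, form in assign_forms(["r1", "r2", "r3", "r4", "r5", "r6"]):
    print(person, form)
```

Strict rotation within each field worker's sequence of interviews keeps the two samples nearly equal in size and spreads differences between neighborhoods and interviewers evenly over both forms.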

Method II. The form of the question was changed in suc-
cessive but similar studies conducted one month apart in the 
same cities throughout the nation. 

Method III. Using the same questionnaire, a free response 
question was asked early during the interview. Later on, 
toward the close of the interview, the answer to the free re- 
sponse question was checked by an actual inventory of the 
product in the home. 

Method IV. The results of free responses were compared with
check list responses by utilizing the data from a preliminary
free response question used for the purpose of constructing
the final check list.

The following results report the data obtained from the 
various methods. In every case the method number is given 
for purposes of identification. 


RESULTS 
Method I 
Study 1. In this experiment, a measure of the influence of 
prestige or emotional tone was obtained by referring to a 
famous person. Two questionnaires were asked of comparable 
samples of 2000 persons each. Each questionnaire included 
one of the two experimental forms of the question. The two 
forms of the question were: 
Do you like the idea of having Thanksgiving a week earlier 
this year? 
Do you like President Roosevelt’s idea of having Thanks- 
giving a week earlier this year? 


The first alternate secured 16.7% affirmative replies, while the
second secured 21.4%. Thus the influence of Roosevelt’s name
was apparent. This result is in keeping with the findings of
other workers who reported that agreement or endorsement of
statements is altered when these statements are labeled with
stereotypes such as Fascist, Communist, Socialist, etc.⁴,⁵
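
No measure of sampling error is reported for the two Thanksgiving percentages. The following sketch, which assumes two independent simple random samples of 2000 persons each, indicates how large the difference is relative to its standard error; the calculation is ours and is not part of the original study.

```python
import math


def percent_difference(pct_a, n_a, pct_b, n_b):
    """Return the difference in percentage points, its standard error, and their ratio."""
    p_a, p_b = pct_a / 100, pct_b / 100
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff * 100, se * 100, diff / se


diff, se, ratio = percent_difference(16.7, 2000, 21.4, 2000)
print(f"difference {diff:.1f} points, standard error {se:.2f} points, ratio {ratio:.2f}")
```

On these assumptions the difference of 4.7 percentage points is between three and four times its standard error.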

Study 2. In a nation-wide survey conducted during Oc- 
tober 1939 two questionnaire forms were developed. Each 
form included four experimental questions, in which the 
phrasing was changed for the two forms. These questions 
were placed among other questions which were the same in 
both forms and served as controls. Each form of the ques- 
tionnaire was asked of 3200 individuals. To insure com- 
parability of the two samples, the questionnaires were asked 
of alternate persons. 

The results from eight control questions given below indicate 
the comparability of the samples and the agreement of the data 
obtained with identical forms of the questions. 


                                                        Questionnaire form
Control questions                                           A        B
                                                            %        %

1. Have you ever been interviewed to    Yes                2.9      3.7
   express your opinion on the Gallup   No                91.3     90.5
   or Fortune Polls?                    Don’t know         5.8      5.8

2. Have you written or are you plan-    Wrote              4.5      5.1
   ning to write your Senator or Con-   Plan               4.1      4.5
   gressman on the present neutrality   Neither           91.4     90.4
   debate?

3. Do you think a third term for        Yes               30.3     30.9
   Roosevelt would be a step toward     No                54.3     54.2
   dictatorship?                        Don’t know        15.4     14.9

4. Do you think organizations such as   Government        29.1     28.4
   the Red Cross, YMCA, orphanages,     Community Fund    29.3     28.5
   and hospitals should be supported    Raise own         22.4     22.6
   by the government, the Community     Don’t know        11.6     12.0
   Fund, or should they raise their
   own money?


4 Raskin, E., and Cook, S. W., A further investigation of the measure-
ment of an attitude toward Fascism. J. Soc. Psychol., 1938, 9, 201-206. 

5 Menefee, S. C., The effect of stereotyped words on political judgments. 
Amer. Soc. Rev., 1936, 1, 614-621. 

                                                        Questionnaire form
Control questions                                           A        B
                                                            %        %

5. Has anyone in your family been to    New York          28.1     28.0
   the World’s Fair in New York, in     San Francisco      8.0      8.5
   San Francisco?                       Neither           66.6     66.

6. Do you have an automobile? Year      Yes               62.9     62.
   of model?                            No                37.1
                                        1940, 39, 38
                                        1937, 36, 35      41.1
                                        1934, 33, 32      11.4
                                        1931 and earlier  11.9

7. Do you have an electric or auto-     Yes
   matic refrigerator?                  No

8. In the last election did you vote    Roosevelt
   for Roosevelt, Landon? Other?        Landon
   Didn’t vote?                         Other
                                        Didn’t vote
                                        Refused information


Results from Experimental Questions

        Questionnaire A                         Questionnaire B

1. Do you think that advertising is less    Do you think that advertising is more
   truthful today than it was a year or     truthful today than it was a year or
   two ago?                                 two ago?
        Yes     24.                              Yes     46.7
        No      56.5
        D.K.    18.8                             D.K.    22.1

2. From which countries are we now          From which countries are we now
   getting more false news stories:         getting more propaganda: England
   England and France, or Germany?          and France, or Germany?
        E & F   10.1                             E & F   23.0
        Ger.    33.3                             Ger.    28.2
        Same    38.6
        D.K.    18.0

3. Do you believe that the Communist        Do you believe that the U. S. is on
   Party in the U. S. should be             the way to Communism?
   abolished by law?
        Yes     54.1                             No      67.5

4. Questionnaire A: Whose dentifrice advertising have you seen or
   heard lately?
   Questionnaire B: Whose toothpaste or powder advertising have you
   seen or heard lately?

        Brand     Questionnaire A     Questionnaire B
                         %                   %
          A            24.0                25.9
          B            17.7                16.
          C            14.0                16.0
          D            14.4                13.9
          E            13.9                12.8
          F             2.7                 2.4
          G             2.7
          H             1.4
          I
          J              .9
          K              .8
          L              .4                  .7
          M              .5                  .6
          N             5.5                 4.0


Note the high degree of correspondence in the percentages
representing the replies to the control questions, whereas in
the first question concerning advertising, the differences in
results are marked and indicate the strict adherence to ques-
tion form with which these results must be interpreted. For
example, 56.5% answered no to the less truthful question.
We are not justified in assuming that these people therefore
think advertising is more truthful because the answer to that
question included only 46.7% affirmative replies. Also note the
increase in Don’t know answers in the second form, 22.1%,
compared to the first form, 18.8%.

The second question revealed the influence of the change
from propaganda to false news stories. From the responses
it would appear that almost as many people believe we get
‘‘more propaganda’’ from England and France as believe we
get ‘‘more propaganda’’ from Germany, 23.0% and 28.2%
respectively. However, only one-third as many believe we
get more false news stories from England and France as
believe we get ‘‘more false news stories’’ from Germany, 10.1%
and 33.3% respectively.

The change in phrasing of the third question on Commu-
nism is so marked that in effect two different questions were
asked. The results, of course, are entirely different. It is in-
teresting to note, however, that although a majority, 67.5%,
state that they do not believe that the U. S. is on the way to
Communism, still a small majority, 54.1%, report that they are
in favor of abolishing the Communist Party by law.

In the case of the fourth question, a free response question
on brands, the use of the words, dentifrice on one hand, and
toothpaste and powder on the other, does not introduce any
significant changes in results. These terms apparently are
accepted by people in general as being synonymous. This fact
is further substantiated in the case of Brand B which is a liquid 
tooth cleanser. If the words toothpaste and powder were 
interpreted literally by the respondents, one would expect the 
proportions giving Brand B as the answer to differ markedly. 

In general, there is evidence by this method to support the 
contention that form of the question has a direct and important 
bearing upon the results. 

Method II 

Study 3. The words used in expressing alternatives will
also influence the results obtained. In two surveys, one month
apart, the following were asked of comparable samples of
7867 people in each.

Form A: ‘‘Which of these companies do you think well of
        generally, which not so well?’’ (Company
        name given and response recorded)

Form B: ‘‘Do you think favorably or unfavorably of the
        following companies?’’ (Company name
        given and response recorded)

Essentially these questions are similar except for the change 
from well and not so well to favorably and unfavorably. 


                    Responses
              Form A,      Form B,
Company        Well       Favorably
                %             %
   A           79.4          67.6
   B           58.3          46.9
   C           52.3          46.8
   D           85.5          74.1
   E           72.4          62.1

It can be said that the term favorably was apparently inter-
preted as being a more extreme alternative and therefore 
received fewer responses than the term well. Although there 
was a time interval between the occurrence of these two 
surveys, it is not believed that time alone caused the change 
to the extent indicated by the results, but rather that it was 
the phrasing of the alternatives which produced the changes. 
It is important, then, to use constant phrasing in order to study 
periodically changes in response. The absolute figures are of 
less importance compared to the relative differences and the 
trends indicated. 
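
The point about absolute figures and relative differences can be checked against the five companies tabulated in Study 3. The sketch below uses the published percentages; the variable names are ours.

```python
well = {"A": 79.4, "B": 58.3, "C": 52.3, "D": 85.5, "E": 72.4}       # Form A
favorably = {"A": 67.6, "B": 46.9, "C": 46.8, "D": 74.1, "E": 62.1}  # Form B

# Drop in percentage points when "well" is replaced by "favorably".
drops = {company: round(well[company] - favorably[company], 1) for company in well}

# Order of the companies from highest to lowest under each form.
rank_form_a = sorted(well, key=well.get, reverse=True)
rank_form_b = sorted(favorably, key=favorably.get, reverse=True)

print(drops)
print(rank_form_a, rank_form_b)
```

Every company receives fewer responses under Form B, yet the order of the five companies is the same under both forms, which agrees with the statement that relative standing is less affected by the wording than the absolute level.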


Method III 


Studies 4 to 7. In four product studies the inaccuracy intro-
duced by free response questions was investigated. The re- 
sults obtained support the belief that free response questions 
frequently elicit incorrect information because of the inter- 
viewee’s unreliable memory or because of the advertising 
intensity of certain brands or because of the presence of a 
prestige component. 

In the first study of flashlight battery cells this question was 
asked of 967 owners of flashlights: What make of flashlight 
cells did you buy last? Brand X was reported by 62%. Later 
on during the course of the interview the interviewer obtained 
access to 852 flashlights and actually saw and recorded the 
brand of battery cell in use. The actual per cent of Brand 
X dropped to 55. In another study of 1600 flashlight owners, 
504 owners would not show the flashlights. Seventy-five per 
cent of these 504 owners reported having purchased a certain 
brand whereas inspection of the remaining 1096 owners re- 
vealed only 47% of the flashlights contained this brand. 

Similarly in a study of razor blades the following question 
was asked: 

What make of razor are you using?

                        Responses
Brand     No. reported     No. inspected     % correct
  X            46               42               91
  Y             8                5               62
  Z            18               17               94
Total         125              115               92


Although the N’s are small the tendency to report brands 
incorrectly is consistent. 

In the fourth study on anti-freezes the answers to the ques- 
tion: What brand of anti-freeze do you have in your radiator 
this winter? brought the following results which are compared 
with laboratory tests of 181 samples drawn from the radiator: 

98 reported brand A: 4 had water, 10 alcohol, 6 brand B, 78 brand A
50 reported brand B: 3 had water, 23 alcohol, 23 brand B, 1 brand A
33 reported brand C: 30 alcohol,* 3 brand B

In this instance, memory was undoubtedly an important factor 
as well as advertising intensity because the study was made 
toward the close of winter whereas the anti-freeze was bought 
early in the winter. There is also the possibility that a substi- 
tute brand was sold. Of those reporting brand A only 80% 
had Brand A; of those reporting other brands, only 1% had 
Brand A. Although the N’s in this study are also small be- 
cause of the difficulty and expense in obtaining actual inspec- 
tions or samples, the results are all consistent in exposing the 
possibility of error when the acceptance of respondent’s 
answers is based alone on free response questions. Depending 
on the degree of accuracy vital to an investigation, other forms 
of questions involving check lists, or visual aids, or actual 
inspection may be necessary to supplement free response ques- 
tions. 
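
The percentages quoted for the anti-freeze study follow directly from the counts given above. A minimal sketch of the tabulation (the dictionary layout and names are ours):

```python
# Laboratory finding for each radiator sample, grouped by the brand the owner reported.
found = {
    "A": {"water": 4, "alcohol": 10, "brand B": 6, "brand A": 78},
    "B": {"water": 3, "alcohol": 23, "brand B": 23, "brand A": 1},
    "C": {"alcohol": 30, "brand B": 3},
}

total_samples = sum(sum(row.values()) for row in found.values())

reported_a = sum(found["A"].values())
pct_a_correct = 100 * found["A"]["brand A"] / reported_a

others = [row for brand, row in found.items() if brand != "A"]
reported_other = sum(sum(row.values()) for row in others)
pct_a_among_others = 100 * sum(row.get("brand A", 0) for row in others) / reported_other

print(total_samples, round(pct_a_correct), round(pct_a_among_others))  # 181 80 1
```

The counts sum to the 181 samples tested, and the two percentages reproduce the 80% and 1% stated in the text.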


Method IV 


Study 8. The check list introduces the possibility of error 
because of omissions of alternatives. For example, a recent 


* Brand C is essentially alcohol, and therefore these 30 samples may
or may not be correct responses.
non-experimental study by an organization doing periodic
social surveys included the question and alternatives shown 
below. The question was asked in two surveys one month 
apart. These surveys are reputedly based on 5000 interviews 
with comparable people. In the second survey, alternative 
No. 2 was dropped and an additional alternative, No. 6 was 
offered the respondent. 


Which of these courses of action comes closest to describing
what you think the U. S. should do?


                                                     Results obtained in
                                                     September    October
                                                         %           %
*1. Enter the war at once on the side of England,
    France (and in September, Poland) and send
    an army to Europe                                   2.3         1.7

 2. Enter the war at once, but send only our navy
    and air force to help England, France and
    Poland                                              1.0      not asked

*3. Enter the war on the side of England, France
    (and in September, Poland) only if it looks
    as though they were losing, and in the
    meantime help that side with food and
    materials                                          13.5        10.1

*4. Do not enter the war, but supply England,
    France (and in September, Poland) with
    materials and food and refuse to ship
    [illegible] to Germany                             19.9        12.2

 5. Take no sides and offer to sell anything to
    anybody, but make them pay cash and take it
    away in their own ships                            29.3        36.9

 6. Refuse to sell actual war munitions but sell
    the raw materials that go into the making of
    war supplies to anyone                          not asked       6.4

 7. Refuse aid of any kind to either side, and
    refuse to sell anything at all to either side  [illegible]     23.7

 8. Find some way of supporting Germany            [illegible]      0.1

 9. Other                                               3.4         3.0

10. Don’t know                                          5.8         5.9


* In October, Poland was omitted from these questions.


The comparison of results for the two months must be made
cautiously and with reservations because of the omissions of
the second and sixth alternatives. Presumably people could
have held the opinions indicated by these alternatives at the
time of the respective surveys, but were given no adequate
means of expressing them. Furthermore, the omissions may have
altered the patterns of choices and have obscured the changes
in response.

If the check list is exhaustive there is little danger of bias, 
but if certain possible alternatives have been omitted, the 
answer may be greatly affected by the omission. The useful- 
ness of the check list depends upon the validity of the items, 
their completeness, and their order, and these can be developed 
properly only through making trial studies and checking with 
other question forms. For example, in a study on gasoline 
these questions were asked of 2000 car owners in the same 
survey. 

A. What do you think are the most important qualities about a
   gasoline? (A free response question.)

B. Which of the following qualities of a gasoline do you regard
   as important? a) power...., b) mileage...., c) pickup....,
   d) the carbon...., e) quick starting...., f) no knock....,
   g) smooth performance...., h) reliable company....,
   i) hill climbing...., j) octane rating...., k) others

The rank order correlation between the orders by frequency of
the responses to these two forms was .93, indicating that no great
difference in relative results was obtained with these two
forms of questions.
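
The rank order comparison described above can be sketched in a few lines of Python. The frequencies below are invented purely for illustration (the source reports only the resulting coefficient of .93), the function names are hypothetical, and the formula used is the familiar rank-difference form, which assumes that no two qualities have tied frequencies.

```python
def ranks(counts):
    """Rank items by frequency, 1 = most frequent (assumes no tied counts)."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {item: position for position, item in enumerate(ordered, start=1)}


def rank_order_correlation(counts_a, counts_b):
    """Spearman's rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid without ties."""
    rank_a, rank_b = ranks(counts_a), ranks(counts_b)
    n = len(rank_a)
    sum_d2 = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical frequencies, invented only to illustrate the calculation.
free_response = {"power": 410, "mileage": 620, "pickup": 350,
                 "quick starting": 300, "no knock": 280}
check_list = {"power": 900, "mileage": 1200, "pickup": 700,
              "quick starting": 750, "no knock": 500}

rho = rank_order_correlation(free_response, check_list)
print(round(rho, 2))
```

In this made-up example only two qualities exchange places between the two question forms, so the coefficient stays close to 1.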

Thus the second check-list study shows that it is possible to 
construct a check list which will secure reliable results. But 
the criteria of completeness, validity and order must be met 
through preliminary experimentation. 


SUMMARY 


The findings reported in this paper through the use of four
different experimental methods reveal:

1. That the use of stereotypes or emotionally charged words
produces significant changes in the frequencies of positive,
negative, or don’t know responses. A corollary of this is that
questions affecting the prestige of the respondent, either
constructively or destructively, tend to affect the response.

2. That slight alterations in wording may or may not bring 
about shifts in the number of responses in either direction.
So far no experimental evidence has been adduced to isolate 
the many possible determiners of such shifts or their relative 
magnitude. 

3. That results from free response questions may be mis- 
leading when the memory of the respondent and/or familiarity 
with possible responses operate to any appreciable extent. 

4. That the proportions obtained by alternatives in check 
list questions tend to be influenced by the number and com- 
pleteness of the alternatives presented. 

5. Clearly, that interpretations of survey results must adhere 
closely to the form of the question used if they are to possess 
any reliability. Results can be easily invalidated when the 
interpretations go afield or lose their connection with the ques- 
tion as a unique entity possessing qualities relatively peculiar 
to itself. 








THE BEANE POLL OF FAVORED PSYCHO- 
LOGICAL TESTS 
BETTY BEANE, JOHN CARROLL, AND STEPHEN HABBE


Work Projects Administration of Connecticut 


Americans, it is said, will take a poll at the drop of
a hat. Some observers are fond of reporting that our
citizens believe they can solve controversial problems
by the vote. The ultimate in naiveté, they point out, is the
American’s attempt to discover the truth by counting noses.

Psychologists are not guiltless of these charges, although it
might be expected that their training would lead them to be 
less gullible than the average citizen. When psychologists 
deliberately set about taking a poll among themselves, a word 
of explanation is demanded. 

This is the heyday of the psychological test. There are 
thousands of them on the market and new ones appear almost 
daily. It is a full-time job simply keeping an up-to-date list 
of them. For practical purposes the clinical psychologist 
chooses a dozen or so tests which he masters and uses regularly. 
He has had experience with these tests and he knows what they 
will do and what they will not do for him. He adds other tests 
to the favored group cautiously. While he knows many other 
tests by name and probably is interested in learning of new 
ones as they are announced, he does not have the leisure to 
experiment with them in order to determine their values. It 
might be added that many psychologists lack the facilities to 
check for themselves the standardization of many of the tests. 
It might as well be admitted, too, that some psychologists would 
not know how to go about an experimental or statistical evalua- 
tion of tests even if the conditions for such a study were 
provided. 
Two very practical reasons, in addition to the usual academic
interest, prompted the Beane Poll, which happy name hereafter 
will serve to designate the central matter of this study. The 
WPA in Connecticut was interested in making psychological 
filmstrips for school use and in providing vocational guidance 
for young adults. An evaluation of available tests by experi- 
enced psychologists was needed for both purposes. When the 
Beane Poll was ‘‘in,’’ it was found to fulfill the two purposes 
admirably since it showed clear-cut preferences for certain of 
the tests on the lists. 

In describing the Beane Poll and, particularly, in announcing
the results, the writers desire to emphasize the limitations
of their work. There is no claim that this poll of favored tests 
is at the same time a list of the best tests. Only seventy-four 
of several thousand American psychologists participated in the 
Beane Poll. Ratings were not requested on all tests and con- 
sequently a number of scales, possibly some as good as, or 
better than, those included, were excluded arbitrarily. Again 
there is evidence that incidental and extraneous matters in- 
fluenced the vote on the Beane Poll. These considerations and 
other ones should be kept in mind during the reading of the 
remaining pages of this article since they have an important 
bearing upon the meaning of the Beane Poll as well as upon 
the proper interpretation and use of it. 

From the 1937 directory of the American Psychological
Association a list was prepared of full members of the
Association who expressed an interest (a teaching and/or a research
interest) in psychometrics. One hundred forty-one names
were found to meet these specifications. These psychologists
were asked by letter to rate sixty-one tests.1

1 For the reader who is curious to know other characteristics of this
group of raters, the following facts are appended: an overwhelming
majority were college professors of psychology with several advanced
degrees, usually including the Ph.D. from an American university;
thirteen were under forty years of age, twenty-eight were over sixty
years of age, and the median was forty-nine years old.

Help in the selection of the sixty-one tests which made up
the original lists was obtained from several sources. About
four years ago in this Journal Helen Pallister reported a
similar poll of vocational tests (1). Sixteen of the twenty-four
tests described at length by Bingham (2) were included.
Non-critical listings of tests, such as Hildreth’s manual (3),
and critical listings, such as the Mental Measurement Yearbook
series edited by Buros (4), were consulted. Finally, the
opinions of a number of psychologists in New Haven were
solicited, and then the writers exercised their own judgments.

The sixty-one tests were arranged alphabetically in groups 
of about twenty each under three headings—Intelligence Tests, 
Mechanical Aptitude Tests, and Personality and Interest 
Tests.2 Other groups of tests—such as sensory tests, educa- 
tional tests, art tests, and music tests—were excluded, being 
less than vital to the purposes of this study. Each of the three 
groups of tests retained was mimeographed on separate sheets 
of paper. A box for voting was placed in front of the name 
of each test. The following directions appeared at the top of 
the first sheet:


Please select five tests from this and each of the follow- 
ing divisions which you believe would be helpful in a 
guidance program in secondary schools and colleges. 
Kindly number them from one to five in the order of your 
preference. 


A space at the bottom of each sheet was provided for
“Remarks or Additional Tests Suggested:” The infrequency with
which this space was used led the writers to feel that the
sixty-one tests listed were generally approved.

The three mimeographed sheets, together with a brief letter
of explanation, were clipped together and mailed to the 141
psychologists. A franked, addressed envelope was enclosed to
facilitate the return of the papers. Table 2 shows the tests
included.


2 Others before us have pointed out that such classifications of tests are 
arbitrary to a considerable degree: the practice is defended here solely for 
reasons of expediency. 







Eighty-eight replies were received. Table 1 classifies them. 


TABLE 1
Response to Questionnaire

Item                                           Number   Per cent

Questionnaires mailed                            141       100
Questionnaires returned                           74
Letters received, in lieu of questionnaires       10
Questionnaires returned unclaimed                  4
No replies                                        53





Only the 74 completed questionnaires were used. Many of 
these were scored by check marks only; few of the psycholo- 
gists bothered to distinguish among their choices according to 
the 1-to-5 rating plan. As a result it was necessary to estab- 
lish the final rankings simply by totaling the number of times 
a test was marked, whether this mark was a check or a number- 
rating. Thus, the revised Binet test, which stood second 
among the intelligence tests, was marked in one way or another 
by 47 of the 74 raters. (See Table 2.) 

The ten letters should be mentioned, although their contents, 
unfortunately, could not be tabulated for inclusion in the 
Beane Poll. Four psychologists advised the writers that they 
were not competent to conduct this poll. On the other hand, 
two psychologists modestly disclaimed enough knowledge to 
mark the tests intelligently. Similar numbers gave it as their 
opinions that tests had no place in vocational guidance and 
that tests should not be photographed for classroom use. It 
would be interesting to know if the ten letter-replies repre- 
sented the thoughts of the fifty-three psychologists who failed 
to answer the questionnaire. This is a matter for pure con- 
jecture, and the writers prefer to think that the 60 per cent 
response elicited represents a fair and adequate cross section 
of current thought on the subject of tests by selected members 
of the American Psychological Association. 
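
The 60 per cent figure can be checked against the counts in Table 1. The short Python sketch below does so under two stated assumptions: the count of four unclaimed questionnaires is inferred from the totals (141 mailed, 88 replies), and the "response" is taken to mean completed questionnaires plus letters. The variable names are illustrative only.

```python
# Counts from Table 1; the figure of 4 unclaimed questionnaires is inferred
# from the totals (141 mailed, 88 replies received) and is an assumption.
mailed = 141
replies = {
    "questionnaires returned": 74,
    "letters in lieu of questionnaires": 10,
    "returned unclaimed": 4,
    "no reply": 53,
}

# Assumed definition of "response": completed questionnaires plus letters.
responded = (replies["questionnaires returned"]
             + replies["letters in lieu of questionnaires"])
response_rate = 100 * responded / mailed

# Only the completed questionnaires entered the tabulation of votes.
usable_rate = 100 * replies["questionnaires returned"] / mailed

print(round(response_rate), round(usable_rate, 1))
```

On these assumptions the response is just under 60 per cent, while the questionnaires actually tabulated amount to a little over half of those mailed.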

Table 2—the Beane Poll—shows the tests arranged in order 
of choice with the frequencies or “votes” recorded for each.
It will be noted that all the tests received at least one vote. 
The Strong Interest Blank received the most votes—fifty-eight. 
There was less agreement on the mechanical aptitude tests than 
on the others where the consensus was surprisingly high. In 
all, 367 votes were cast for the twenty intelligence tests, 345 
for the twenty-three personality and interest tests, and 300
for the eighteen mechanical aptitude tests. Undoubtedly, the 
intelligence tests were more familiar to the raters than the 
other tests. 

The Beane Poll was taken during February, 1938. Since 
that time the results have been used by the Connecticut WPA 
in making psychological filmstrips for teaching use and as a 
basis for test selection in the Adult Guidance Services of 
Bridgeport, New Haven, Hartford, and Stamford. The New 
Haven Guidance Service has collected and published a large 
amount of pertinent information about some of the highest 
ranking tests in the three sections of the Beane Poll, and this 
material, entitled “Tests and Scales,” is offered for free
distribution to members of the American Psychological
Association.3
THE RELIABILITY OF NEW-TYPE OR
OBJECTIVE TESTS IN A NORMAL 
CLASSROOM SITUATION 


HAROLD D. CARTER AND AILEEN P. CRONE1
University of California School of Education 


I. INTRODUCTION 


To what extent must one lengthen new-type or objective
tests in order to secure satisfactory reliability and
validity coefficients? The majority of persons using
such tests in the past have proceeded on the assumption
that the best way to make them reliable is to increase the
number of items. The statistical formula implied is the
Spearman-Brown prophecy formula. Recently, a few writers (1, 6)
have recommended somewhat more specifically than others the
improvement of reliability and validity of tests merely by
omission of non-valid or unreliable parts, without the addition
of new items. The statistical formula implied is that for the
correlation of simple sums.

These two viewpoints are not directly opposed, as is clearly 
seen through consideration of the prerequisites to correct ap- 
plication of the Spearman-Brown formula. Yet the implica- 
tions for procedure in the practical use of tests are not entirely 
clear. The prevalent technique of item-analysis is used by the 
average worker with the assumption that most unimproved 
tests can be shortened without loss in efficiency, but there is very
little evidence as to the extent of improvement which can be
brought about by such methods. The improvement is ordinarily
sought through the addition of new and better items.

Empirical approaches to the problems involved are much
needed for integration of theory and practice. Anderson (1)
and Flanagan (6) have suggested, on the basis of theoretical
and empirical considerations, that a given test can be shortened
and at the same time improved in reliability and validity
through omission of some parts, and without addition of new
material. This does seem reasonable, from consideration of
the variability in difficulty and validity of test items. But
many studies, exploring many typical sets of tests, are necessary
to demonstrate how far the implications of the theory
can be realized. The present paper reports one such study
aimed at exploring some of the effects of elimination of the
least reliable components from examination material used in
a college course.

1 The writers wish to acknowledge their indebtedness to Dr. Herbert S.
Conrad, who read and criticized the manuscript.


II. THE DATA 


The data employed are a set of examinations in a first course
in educational psychology. The set consisted of three mid-term
tests and a final examination; the mid-term tests were
each 50-minute tests and the final examination was a two-hour
test. Each examination included true-false, multiple-choice,
and completion sections; the length of each section will be made
apparent in the presentation of the tabular results of the
study.

The course is divided into four parts. The first mid-term 
test covered part one, which is concerned with the extent and 
measurement of individual differences. The second mid-term 
test covered part two; the topic is factors in the causation of 
individual differences. The third test dealt with social and 
emotional aspects of development. The final examination em- 
phasized the psychology of learning, but included questions 
bearing upon the other topics as well. 

The range of talent in the groups studied is perhaps slightly 
restricted. The students were nearly all college juniors; a 
few seniors were included. The school of education in which 
this study was conducted restricts its enrollment to upper 
division and graduate students. In addition, the requirement 
of a 1.5 honor-point ratio for work toward a teaching credential 
furnishes students in the school with a special motive for high
scholarship. The honor-point ratio is computed by assigning
three points for an A, two for a B, one for a C, zero for a D,
and minus one for an F. The evidence indicates that the students
in the school are strongly motivated to secure high marks.
Tests need to be rather difficult for satisfactory discrimination
within this group, and the achievement of high reliability is of
course somewhat more difficult than in lower-division classes.
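
The honor-point ratio described above can be illustrated with a short Python sketch. The point values are those given in the text; weighting each grade by course units and dividing by total units is an assumption made here for illustration, since the text does not spell out the denominator. The function name and the sample record are hypothetical.

```python
# Point values as stated in the text.
HONOR_POINTS = {"A": 3, "B": 2, "C": 1, "D": 0, "F": -1}


def honor_point_ratio(courses):
    """courses: list of (letter_grade, units); returns honor points per unit.

    Weighting by units is an assumption, not something the source specifies.
    """
    total_units = sum(units for _, units in courses)
    total_points = sum(HONOR_POINTS[grade] * units for grade, units in courses)
    return total_points / total_units


# Hypothetical record of four courses.
record = [("A", 3), ("B", 3), ("C", 4), ("B", 2)]
ratio = honor_point_ratio(record)
meets_requirement = ratio >= 1.5

print(round(ratio, 2), meets_requirement)
```

Under this reading, a student whose marks are evenly divided between B and C has a ratio of exactly 1.5 and just meets the requirement.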

The tests used in this study had not been previously item- 
analyzed or validated, and they were not specially constructed. 
The selection of tests of an ordinary sort, used for the first 
time, was intentional. The instructor in the course made up 
all the questions, following such rules and procedures as indicated
in the well-known manuals by Paterson (10), Ruch (12), and
Hawkes, Lindquist, and Mann (7). Like most psychologists,
the instructor has had extensive experience in the construction 
of new-type tests. However, as is often the case, he sometimes 
had very little time to spend upon the construction of tests. 
This is probably the normal course of events. It should be 
expected to result in production of a set of tests varying some- 
what in quality, and considerably less reliable than the new- 
type test methods permit under more favorable conditions. 
This is exactly the sort of set of examinations desired for the 
present study. It is probably much more representative of
current practice than a set of item-analyzed, validated, and 
scientifically improved tests. 























III. PROCEDURE AND RESULTS 


The treatment of the data has been arranged to find out as
much as possible about the reliability of a set of examinations
used under the conditions described. The first specific purpose
is to find out how reliable the total score is, for each test
and for the set as a whole. A second purpose is to discover
the effects of several methods of eliminating unreliable parts
from the total set of examinations. A third purpose is to set
forth simple principles which may be applied by any teacher
who wishes to improve his course examinations insofar as it
can be done without recourse to methods which are unduly
time-consuming.

Table 1 summarizes the data with respect to the reliability
of the tests, their parts, and the sums of the several tests. The
reliability coefficients have been computed for each part and
for the sums by correlation of scores on odd and even halves,
and subsequent adjustment of these correlations by means of
the Spearman-Brown formula. As expected, these tests and
their parts vary greatly in reliability. A second fact, clearly
shown, is that the completion sections, in spite of being shorter,
are definitely more reliable than the recognition sections.
This is in agreement with other studies, exemplified by the
recent work of Bird and Andrew (2, 3, 4).
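
The odd-even procedure described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: items are scored 0 or 1, the six-item scores for five students are invented, and the function names are hypothetical. The adjustment applied is the Spearman-Brown formula for a test twice the length of each half, 2r / (1 + r).

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def split_half_reliability(item_scores):
    """item_scores holds one list of 0/1 item scores per student."""
    odd = [sum(s[0::2]) for s in item_scores]   # items 1, 3, 5, ...
    even = [sum(s[1::2]) for s in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    # Spearman-Brown adjustment for a test twice as long as each half.
    return 2 * r_half / (1 + r_half)


# Hypothetical scores for five students on a six-item test.
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
]
reliability = split_half_reliability(scores)
print(round(reliability, 2))
```

In this invented example the two halves correlate only about .47, yet the adjusted coefficient for the whole test is .64, which shows how much the adjustment raises the estimate.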

Two further facts are revealed by scrutiny of Table 1. First, if
type of question is held constant, the effect of cumulation is
always to increase the reliability of the resulting total. This
is true in spite of the fact that some of the parts included in the
total are very unreliable. This suggests, but does not prove
conclusively, that it is ordinarily not easy to improve the
reliability of a test total by elimination of unreliable parts.
Secondly, considering the individual tests, addition of the different
parts results in a total more reliable than any of the parts, in
every case but one. For test three, the completion section
taken alone is slightly, but not dependably, more reliable than
the total test. This general picture, then, clearly suggests that
each of the unreliable parts makes a positive contribution to
the reliability of the total. This finding is not surprising to
those who would apply the Spearman-Brown formula
uncritically, but it may be surprising to persons who have made
empirical study of the difference between a multiple correlation
and a simple correlation of sums.

A reasonable explanation of these results is not hard to find.
It is possible that the effects of unreliable elements in one
section are cancelled by the effects of some elements in other
sections; as a result, the unreliable elements do little or no harm
to the reliability of an extensive total.

2 A different approach to this problem is concerned with reliability per
unit of working time. A discussion of this approach is not attempted in
the present report since the data are not suitable.


TABLE 1

Indications of the Difficulty, Length, and Reliability of a Set of
New-Type Examinations Which Have Not Been Item-Analyzed, Previously
Validated, or Improved on the Basis of Experience. Correlations Based
upon 143 Cases.

[Column headings recoverable: No. of questions; Per cent right;
Reliability; Reliability per 100 items. Rows: True-False, Mult.-Choice,
Completion, and Total for each test. The numerical entries are not
legible in this copy.]

The observations above are primarily concerned with the
actual reliabilities shown in the next to last column of Table 1.
The last column shows the reliabilities that would be obtained
for such tests if they were all of equal length; the coefficients
have been corrected by the Spearman-Brown formula, to show
the reliabilities to be expected on the basis of 100 items. Here,
as in other studies, one sees evidence of the marked superiority
of the completion items. The data also indicate that the
excellence of examinations prior to improvement on the basis of
empirical study tends to fluctuate considerably. The best
multiple-choice section (Test I) is better than the poorest
completion section (Test II); likewise, the best true-false section
(Test IV) is almost as good as the poorest completion section
(Test II). These data strongly suggest that the easier
recognition questions, although they tend to be inferior to recall
questions, can be so selected as to yield short tests with very
respectable reliabilities.

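
The adjustment to a common length of 100 items can be sketched with the general form of the Spearman-Brown formula, in which a test lengthened by a factor k has expected reliability kr / (1 + (k - 1)r). The Python below is a minimal illustration; the function names and the example values (a 40-item section with reliability .60) are hypothetical and are not taken from Table 1.

```python
def spearman_brown(reliability, length_factor):
    """Expected reliability when a test is lengthened by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def reliability_per_100_items(reliability, n_items):
    """Project an observed reliability to a test of 100 comparable items."""
    return spearman_brown(reliability, 100 / n_items)


# Hypothetical example: a 40-item section with an observed reliability of .60.
projected = reliability_per_100_items(0.60, 40)
print(round(projected, 2))
```

A section that already contains 100 items is left unchanged by the projection, so the last two columns of such a table coincide for that row.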
The data in the last four rows of Table 1, at the bottom,
indicate some additional facts. The omission of Test II, which
was by far the most unreliable of the four, has resulted in totals
which are slightly less reliable than the totals of all four tests.
From a practical standpoint, it can be argued that the total of
three tests is almost as good as the total of four. But the
conclusions with respect to the theoretical questions under
consideration in this paper are quite different. These data
furnish an additional indication that one cannot easily make test
totals more reliable by elimination of unreliable parts. What
is even more striking is the evidence from the four figures at the
bottom of the extreme right-hand column. The efficiency of
the tests is not only not raised by omission of the most
unreliable of the four, but it is actually slightly lowered.

The figures for percentage of questions correct, shown in 
column two of Table 1, can be inversely ranked to obtain an 
index of the difficulty of the various sections and totals of test 
material. It is evident from the table that the more difficult 
sections tend to be the more reliable. There is of course an
association in these data between difficulty and type of ques- 
tion, but the relation between difficulty and reliability is seen 
even when only one type of question is considered at a time. 
A rank difference correlation between the appropriate columns 
of Table 1 indicates a marked relationship between difficulty 
and reliability (Rho is .51). When one correlates length with 
reliability Rho is .62. These data suggest that one can make 
these examinations more reliable by substitution of harder 
questions for some of the easier ones. Because of time limita- 
tions it would not be convenient to make the tests longer as a 
method of increasing reliability ; and as Bird and Andrew (3) 
have noted, the labor involved in the use of long tests is not 
negligible. 

Table 2 presents the means, standard deviations, and
inter-correlations between scores for the examinations of the course.
The average inter-correlation is .55; there is no great range of
values of inter-correlations. Correction for attenuation by
Spearman’s formula (see reference 8, formula 155, page 204)
results in inter-correlations which have an average value of .82.
The evidence shows that the four tests are measuring aspects
of achievement which are not entirely the same. This is of
course as one should expect, in view of the fact that each test
covered a particular section of the course. These
inter-correlations furnish a basis for comparison with those to be
obtained from improved tests; in view of the conditions under which
the tests were arranged, it is of course not possible to deduce
the exact effect of improvement of the tests upon their
inter-correlations.


TABLE 2

Means, Standard Deviations, and Intercorrelations of Scores of 143
Students in a Course in Educational Psychology

             Test I    Test II    Test III    Test IV

Mean          59.10     70.12      66.88       83.58
S.D.           8.25      4.32       6.09       10.24
Correlations (surviving entries, in reading order; the remaining
entries and the column positions are not legible in this copy):
  Test I       .56
  Test II      .88
  Test III     .76   .87
  Test IV      .76   .84

Note: The figures below the diagonal are the correlations corrected for
attenuation.
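
Spearman’s correction for attenuation, referred to above, divides the observed correlation between two tests by the square root of the product of their reliabilities. A minimal Python sketch follows; the function name and the numerical values are hypothetical and are chosen only to make the arithmetic easy to follow.

```python
from math import sqrt


def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Estimate the correlation between true scores on two tests."""
    return r_xy / sqrt(reliability_x * reliability_y)


# Hypothetical values: observed correlation .48, reliabilities .64 and .81.
corrected = correct_for_attenuation(0.48, 0.64, 0.81)
print(round(corrected, 2))
```

Because the divisor is never greater than 1, the corrected value is always at least as large as the observed one, which is why the corrected inter-correlations average so much higher than the raw ones.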

Table 3 shows the means and standard deviations of scores
and the inter-correlations between scores, for the totals of the
true-false, multiple-choice, and completion sections of the
examinations. The standard deviations indicate that the
shorter, harder completion tests are more discriminating than
the recognition tests. The correlations in Table 3 may be
interpreted as evidence concerning the extent to which examinations
of different types covering the whole course will tend to measure
the same thing. The correlations are fairly high when
corrected for attenuation; they suggest that these different types
of questions tend to measure very similar, but not identical,
components of achievement.


TABLE 3

Means, Standard Deviations, and Intercorrelations of Scores for
True-False, Multiple-Choice, and Completion Totals Based upon Four
Tests. Data from 143 Students.

                    True-False     Multiple-Choice    Completion
                    questions      questions          questions

No. of questions    [illegible]    115                100
Mean                143.08         80.26              56.43
S.D.                8.36           8.01               10.03
Correlations:
  True-False                       .69                .70
  Mult.-Choice      [illegible]                       .72
  Completion        .86            .90

Note: The values given below the diagonal are the correlations
corrected for attenuation.
IV. THE METHOD OF REVISION AND THE RESULTS OF ANALYSIS
OF THE REVISED TESTS


Since the analysis indicated some need for improvement of
the tests, revision was undertaken. The evidence shows that
some of the tests were much too easy,3 and that improvement
might be brought about through item-analysis and revision.
Since the study is concerned primarily with simple and
practical procedures, no elaborate method of revision was
undertaken. The procedure used is one which may be applied by any
instructor or test user, with a minimum of labor.

The best twenty and the poorest twenty papers were chosen 
from the group of 143 on the basis of total scores. This was 
done for each examination separately. A tabulation was made 
of the errors on each item in the test, separately for the twenty 
best and twenty poorest papers. Items were regarded as un- 
desirable if too easy (fewer than 10 per cent errors) or too 
difficult (more than 90 per cent errors) or negatively dis- 
criminating (more errors by good students than by poor stu- 
dents). A small proportion of the items was rejected from 
each test, and these items were replaced by new (untried) 
items. The entire procedure of revision of a single test 
required about three hours’ work. 
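
The revision procedure just described can be sketched in Python as follows. The thresholds (fewer than 10 per cent or more than 90 per cent errors, or more errors among the best papers than among the poorest) are those stated above. Two points are assumptions made for illustration: the percentage of errors is computed over the forty selected papers combined, which the text does not specify, and the function names and the error counts are hypothetical.

```python
def extreme_groups(total_scores, group_size=20):
    """Return the indices of the best and the poorest papers by total score."""
    order = sorted(range(len(total_scores)),
                   key=lambda s: total_scores[s], reverse=True)
    return order[:group_size], order[-group_size:]


def flag_items(errors_best, errors_poorest, group_size=20):
    """Flag undesirable items from error counts in the two extreme groups.

    Assumption: the error percentage is taken over both groups combined.
    """
    flagged = {}
    pairs = zip(errors_best, errors_poorest)
    for item, (e_best, e_poor) in enumerate(pairs, start=1):
        error_rate = (e_best + e_poor) / (2 * group_size)
        if error_rate < 0.10:
            flagged[item] = "too easy"
        elif error_rate > 0.90:
            flagged[item] = "too difficult"
        elif e_best > e_poor:
            flagged[item] = "negatively discriminating"
    return flagged


# Hypothetical error counts for five items (twenty papers in each group).
errors_best = [0, 19, 8, 2, 1]
errors_poorest = [1, 20, 5, 12, 2]
flagged = flag_items(errors_best, errors_poorest)
print(flagged)
```

In the invented example the fourth item is the only one retained: it is missed by a fair share of the poorest papers and by few of the best.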

The data in Table 4 indicate the nature and extent of the
changes made in accordance with this very simple and
objective technique of revision. The total number of questions has
been increased from 385 to 398. Only test four has been
increased in length; each of the three mid-term tests has been
shortened slightly. Almost a third of the true-false questions
were excluded after the first tryout; about ten per cent of
the multiple-choice and about five per cent of the completions
were excluded. The writers believe that this is a representative
indication of the need for revision of tests of these types
constructed by experienced individuals.

3 In ascertaining the most desirable difficulty-level of examinations, for
practical purposes it is important to consider both statistical principles
and evidence concerning student rapport and student reactions. Both
types of evidence suggest that some of the tests used in this study were too
easy.


TABLE 4

Indications of the Nature and Extent of the Revision of the Tests

[Column headings: Original no. of items; No. thrown out; New items
added; New total number. Rows: True-False, Mult.-Choice, Completion,
and Total for each of Tests I to IV and for the totals. The numerical
entries are not legible in this copy.]

Table 5 shows the effects of the revision upon the reliability
of the tests, when they were applied to the next semester’s class.
Although three of the tests were shortened, they were made more
reliable by the change in content. The final examination, which
was increased in length, has not become noticeably more
reliable. The efficiency of the tests as indicated by the reliability
per 100 questions has been markedly improved except in the
case of the final examination. The assumption that the two
classes are comparable is apparently justified by all the
information available. A basis for optimism in
the interpretation of the findings is afforded by the fact that a
marked increase in reliability is found for the test which
obviously stood most in need of revision. The evidence
indicates that from the standpoint of reliability at least, a test’s
excellence is somewhat related to the care taken in preparing
and revising the test. The practical significance of these
findings relates to practices in the repeated use of objective
examinations.


TABLE 5

Indications of the Difficulty, Length, and Reliability of a Set of
New-Type Examinations Which Have Been Improved through Item-Analysis.
Correlations Based upon 125 Cases.

                  No. of      Per cent   [heading      Relia-    Reliability
                  questions   right      illegible]    bility    per 100 items

Test I
  True-False
  Mult.-Choice
  Completion
  Total

Test II
  True-False
  Mult.-Choice
  Completion
  Total

Test III
  True-False
  Mult.-Choice
  Completion
  Total

Test IV
  True-False
  Mult.-Choice
  Completion         35
  Total             155

Totals
  True-False        153         78
  Mult.-Choice      145         66
  Completion        100         62         .76           .86
  Total             398         70       [illegible]     .91

[Entries not shown are not legible in this copy.]


One such study as this does not answer all the questions 
which might be raised. A troublesome fact is that reliability
coefficients are themselves unreliable. In the present study, 
empirical evidence of the significance of the trends in the data 
is offered, by presenting evidence from four tests rather than 
one. The drop in reliability of the completion part of test 
three may be attributed to unreliability of the coefficients of 
correlation, since this test remained unchanged. The only 
other drop in reliability is for the true-false section of test 
four; since this section was considerably altered and increased
in length it may be assumed that the new parts are unsatis- 
factory and that the test needs further revision. The other 
ten test-sections revealed increases in reliability. 
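
The remark that reliability coefficients are themselves unreliable can be given a rough numerical form. The sketch below is not the authors' procedure. It applies the Fisher z-transformation, a standard approximation for an ordinary product-moment correlation, and treats a reliability coefficient as if it were such a correlation, which holds only approximately for split-half or stepped-up coefficients. The function name, the example coefficient of .80, and the 1.96 multiplier (about 95 per cent) are illustrative assumptions; the class size of 125 is the figure reported for the revised tests.

```python
import math

def fisher_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation r based on n cases.

    Assumption: r behaves like an ordinary Pearson correlation, so that
    atanh(r) is roughly normal with standard error 1 / sqrt(n - 3).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must exceed 3")
    z = math.atanh(r)                 # Fisher z-transformation
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

if __name__ == "__main__":
    # Hypothetical coefficient of .80 from a class of 125 students.
    low, high = fisher_interval(0.80, 125)
    print(f"approximate interval: {low:.2f} to {high:.2f}")
```

Under these assumptions a coefficient of .80 from 125 cases is compatible with values from roughly .73 to .86, so small rises or falls in a single test-section should be read with caution.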

In spite of the revision, the final totals show no reliable
increases in reliability. The obtained reliability coefficient for
the new total of all the parts is .91 as compared with the
coefficient of .92 prior to revision. The results suggest that
raising such high reliability coefficients through a simple pro- 
cedure of revision is not easy. Further data would be needed 
in order to find out whether sampling differences can account 
for this result. The effects of further revision should also be 
examined when additional data are available. 

An indication of the efficiency of the tests is furnished by 
the reliability coefficients per 100 items. These data were 
obtained by application of the Spearman-Brown formula. 
They indicate that the simple revision of the tests has in general 
improved the efficiency of the tests except in the final examina- 
tion. These figures also indicate the reliabilities which may be 
expected from short tests of the three types. The completion
tests are definitely superior in this respect to the two other
types of test, but the data show that very respectable reliability
coefficients can be achieved through the use of true-false or
multiple-choice tests of 100 items.
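
The reliability per 100 items described above can be sketched as follows. This is a minimal illustration of the Spearman-Brown formula in its general form, r_k = k r / (1 + (k - 1) r), where k is the ratio of the new length to the old. The function names are hypothetical, and the sketch assumes that the added or removed items are comparable to those retained.

```python
def spearman_brown(reliability, length_ratio):
    """Project reliability to a test lengthened (or shortened) by length_ratio.

    Spearman-Brown formula: r_k = k * r / (1 + (k - 1) * r).
    """
    if length_ratio <= 0:
        raise ValueError("length_ratio must be positive")
    return length_ratio * reliability / (1.0 + (length_ratio - 1.0) * reliability)

def reliability_per_100_items(reliability, n_items):
    """Reliability expected if the test were rescaled to 100 comparable items."""
    return spearman_brown(reliability, 100.0 / n_items)

if __name__ == "__main__":
    # The revised composite: 398 questions with a reported reliability of .91.
    print(round(reliability_per_100_items(0.91, 398), 2))
```

For the revised composite of 398 questions with a reliability of .91, the formula gives about .72 per 100 items. Stating reliability on a common length in this way lets tests of different lengths be compared for efficiency.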

Table 6 presents the means, standard deviations, and intercorrelations
for the revised tests. Some of these data are
difficult to interpret because more than one factor has been
varied (e.g., the test length, the test difficulty, and the sampling
of students). The single consistent indication is that
improvement of the tests has systematically lowered their intercorrelations.
This may be an artifact of the method of improvement
of the tests, since that method could cause each
test to measure more reliably those items of achievement specific
to the particular section of the course covered. This finding
uncovers a problem to be studied in further sets of data.

Table 7 shows the means, standard deviations, and intercorrelations
for the revised true-false, multiple-choice, and
completion totals. Remembering the increases in the reliability
coefficients, it is gratifying to note that the standard deviations
of scores on the true-false and multiple-choice totals have been
increased by the simple procedures of revision. Since the completion
sections were least altered, the slight drop in reliability
may be attributable to chance variation. The slightly lowered
coefficient for the completion total is still adequate. The
inter-correlations between the true-false, multiple-choice, and
completion test totals remain much as they were prior to the
revision of the tests.

TABLE 6
Means, Standard Deviations, and Intercorrelations of Scores on Revised
Tests in a Course in Educational Psychology. Data from a Class of 125
Students.

                 Test I    Test II   Test III   Test IV

Mean             52.93     61.98     59.70      103.48
S.D.              5.96      7.68      6.78       12.31
Correlations:
  Test I                   .41       .47        .46
  Test II        .50                 .54        .43
  Test III       .62       .6[?]                .51
  Test IV        .76       [?]       .66

Note: The figures below the diagonal are the correlations corrected for
attenuation.

TABLE 7
Means, Standard Deviations, and Intercorrelations of Scores for True-False,
Multiple-Choice, and Completion Totals Based upon Four Revised
Tests. Data from 125 Students.

                 153 True-False   145 Multiple-Choice   100 Completion
                 questions        questions             questions

Mean             120.14           95.81                 61.94
S.D.               9.96           11.78                  9.20
Correlations:
  True-False .67
  Mult.-Choice .88
  Completion 2 .88

Note: The values given below the diagonal are the correlations corrected
for attenuation.
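
The correction for attenuation mentioned in the table notes can be sketched as follows. This is a minimal illustration of the usual formula, in which the observed correlation is divided by the square root of the product of the two reliabilities. The function name and the example values are hypothetical and are not taken from the tables.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between true scores on two tests.

    r_xy is the observed correlation; r_xx and r_yy are the reliabilities
    of the two tests.
    """
    if r_xx <= 0 or r_yy <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / math.sqrt(r_xx * r_yy)

if __name__ == "__main__":
    # Hypothetical values, not taken from the tables.
    print(round(correct_for_attenuation(0.60, 0.80, 0.75), 2))
```

Because the divisor cannot exceed 1 when both reliabilities are at most 1, the corrected value is never smaller than the observed one. This is why the figures below the diagonal run higher than those above it.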


V. SUMMARY AND CONCLUSIONS 


An empirical study has been made of some theoretical and 
some practical procedures for increasing the reliability of 
classroom tests. For this purpose a new and untried set of 
tests constructed under ordinary teaching conditions was 
selected. Reliability coefficients and intercorrelations were 
computed for the tests, using data from a large class. A very 
simple, practical, and non-technical procedure of revision was 
applied to the tests. Measures of reliability and of intercor- 
relations were again computed using data from application of 
this revised set of tests to another large class. The data, when 
examined in the light of recent studies, lead to the following 
conclusions:

1. A set of new-type examinations constructed under typical
classroom conditions furnishes a reliable basis for marking.
The total scores in the set of tests used in this study yielded
reliability coefficients above .90.

2. The fluctuation in reliability of such tests indicates that
the practice of revision is highly desirable. This is especially
true in situations where fewer tests are administered in a
course.

3. Elimination of one very unreliable examination from a
set of four tests resulted in a lowering of the reliability of the
total. Furthermore, it lowered the efficiency of the composite
as measured by the reliability per 100 questions.

4. The inter-correlations between true-false, multiple-choice,
and completion tests covering the entire course tend to be
high. The range of such correlations is small. The average
value is .68 for raw correlations and .84 for correlations corrected
for attenuation.

5. The results are in agreement with those of other investigators
in showing that completion tests tend to be more reliable
than recognition-type tests composed of the same numbers
of questions. The marked superiority of the completion test
method is indicated by the fact that the completion tests were
more reliable even though the other tests contained many more
items.

6. Prior to revision, recognition tests tend to include many
items which are too easy, while completion tests tend to include
a few items which are too hard. Revision according to a
stated set of rules tended to make the true-false and multiple-choice
tests harder and the completion tests slightly easier.

7. New-type tests can be shortened and at the same time
improved in reliability, through a technique of simple revision;
but the data reported here suggest that in a typical set of tests
the mere omission of unreliable parts is not adequate to
increase the reliabilities. The results suggest that any marked
increase in reliability will ordinarily require the inclusion of
new material.

HAROLD D. CARTER AND AILEEN P. CRONE


NEWS AND NOTES


The customary luncheon of the Psychological Corporation during the 
A.P.A. meetings will be held this year on Wednesday, September 4, at 
Pennsylvania State College. All Research Associates of the Corporation 
and others interested in its activities are invited to attend. 


At the fifteenth meeting of the Midwestern Psychological Association
at the University of Chicago last May, Dr. Warren R. Baller, professor
of Educational Psychology at the University of Nebraska, presented a
paper on ‘‘The Present Social Status of a Group of Adults Who, When
They Were in School, Were Judged to be Dull in Mental Ability.’’ In
his study Dr. Baller examined 307 ‘‘dull’’ subjects, 307 of average
intelligence and 206 persons with ‘‘markedly deficient mentalities,’’ all
living in Lincoln, Nebraska. The average age of the groups studied was
264 years; 9.8 per cent of the ‘‘dull’’ persons had been divorced, whereas 
in the ‘‘normal’’ group the figure was 4.4 per cent. Violations of the 
law, judged by court records and institutional commitment, occur from 
three to seven times more often among the dull subjects compared with 
persons of average intelligence; 70 per cent of the dull subjects were 
found to be wholly self-supporting, as compared with 84 per cent of those 
of normal intelligence. Dr. Baller stressed the need for extension of 
educational programs presented by public schools to include greater atten- 
tion to the needs of ‘‘dull’’ students as a partial remedy for unemploy- 
ment, delinquency and divorce. 


The Safety Research Planning Committee of the New York University 
Center for Safety Education recently published a report of a comprehen- 
sive program of safety research designed to help determine basic facts 
about the causes and control of accidents. The report, ‘‘Research Needs
in Safety Education,’’ according to Dr. Ned H. Dearborn, dean of the
University’s Division of General Education, was prepared ‘‘for the general
guidance of educators and officials of safety organizations in implementing
the attack on the accident problem through school and non-school
agencies.’’ Asserting that ‘‘an accident is a complex of physical conditions,
social environment, and the susceptibilities of individuals,’’ the
Committee mapped out four fields of investigation in problems of safety
education: (1) basic research in the social and psychological background
of safe and unsafe behavior, with emphasis on individual differences; (2)
surveys of the present status of safety education; (3) experimental
investigation of alternative procedures in safety education, with the aid of
specially developed tests and measurements; and (4) adjustment of
educational practices for the most effective use of subject matter. Education
for safety means the modification of behavior that may be deep-rooted
and the laying of the foundation for future behavior.


The Journal of Consulting Psychology for March-April 1940 contains
a most enlightening article by John G. Darley and Ralph Berdie entitled
‘‘The Fields of Applied Psychology.’’ The authors’ purpose was to
identify the people engaged in the various branches of applied psychology
and to discover their duties, qualifications, training and other professional
characteristics. A questionnaire was sent to 3,097 psychologists. To
secure the names of these 3,097 applied psychologists the membership
lists of twenty-four organizations, totaling 5,198, were used. In addition,
appeals for the names of applied psychologists who were not affiliated
with any of the professional psychological organizations were sent to 124
psychologists throughout the country, 227 school superintendents, the
chief educational officer in each state, 573 directors of hospitals, and 136
wardens of state and federal prisons. Eleven hundred twenty-four
questionnaires, or 39 per cent, were returned and usable. The respondents
were asked to classify themselves into one of eight groups with the
following results: Clinical, 300; Psychometricians, 81; Consulting, 74;
Educational, 345; Teacher, 182; Industrial, 43; Statisticians and others, 99.
The returns of the questionnaire with the aid of tables were analyzed in
regard to duties, academic training, membership in professional
associations, salary, age and academic degrees. It is shown that consulting
psychologists have the most advanced degrees and are also the highest
paid group. Industrial psychologists are the best paid of the younger
group, while teachers are the lowest paid of the older group. An analysis
of the data also shows that almost one-half of the whole group of
3,097 applied psychologists belongs to no professional psychological
association; 28 per cent belongs to the American Psychological Association,
and only 13 per cent belongs to the American Association for Applied
Psychology. If the returns are representative, the qualifications
required for membership in the latter organization, existing especially for
applied psychologists, admit slightly more than one-third of them, a
somewhat paradoxical situation which might well be investigated further
and may well suggest greater effort on the part of the Association to
secure as members these persons who are really doing so much in applied
psychology.


Henry Ford has launched a ‘‘National Youth Movement’’ of his own.
In a small booklet, mailed to 25,000 other industrialists and manufacturers,
he points out the opportunity open to every employer of labor to
help in solving what he terms ‘‘our gravest national problem—youth 
unemployment.’’ He believes that if each private business and industrial 
management would start five or twenty or a hundred boys, according to 
its ability, in a trade school or on the land, on a basis that permits youth 
to retain its self-respect, youth unemployment would soon be ended. 

The booklet explains Mr. Ford’s ideas on how to meet the situation 
constructively and gives the results of his own experiments along this 
line at Camp Legion and Camp Willow Run, respectively at Dearborn 
and Ypsilanti, Michigan. At each camp is a group of 65 boys, most of
whom are sons of dead or disabled veterans. Some were homeless; most 
of them undernourished. All had previously been unemployed. To these 
youth camps Mr. Ford gives his personal, daily attention. Here, under 
competent men, they are taught the fundamentals of agriculture. They 
plant, cultivate and harvest many kinds of garden crops on the two 320- 
acre tracts provided by Mr. Ford. The products are sold at wayside 
stands. The boys sleep in tents, have their meals in a special mess-hall 
and each is paid $2.00 a day. At the end of the season the cash balance 
remaining after payment of operating expenses is divided among the 
boys equally. Last season each boy received $128.00 in addition to his 
wages. When the camps close in November the boys are not simply 
dropped back on the street corner but are given the opportunity to enter 
the Henry Ford Trade School or to go directly into one of the Ford plants. 

In addition to the opportunity given these boys to earn and learn at 
the same time under such favorable conditions are the lessons learned in 
cooperativeness, self-reliance and self-respect. 


The Neuro-Psychiatric Institute of the Hartford Retreat has recently
issued its one hundred and sixteenth annual report for the year ending 
April 1, 1940. In his report, the Psychiatrist-in-chief, Dr. C. Charles 
Burlingame, speaks of the present social revolution as well as the revolu- 
tion in our concept of psychiatry and the problem of mental and nervous 
disease. New methods are replacing the old. The old type of mental 
hospital is giving way to the progressive medical and educational insti- 
tution. Under his leadership during the past nine years the faculty of 
teachers and therapists has grown to number sixty—approximately one 
to every five patients. According to Dr. Burlingame, ‘‘a well-balanced 
life depends upon four supporting timbers: first, practical vocational 
skill and discipline; second, the development of healthy avocational in- 
terests; third, sound social and recreational practices and fourth, satis- 
factory physical educational habits. Experience has taught us that an 
individual having all four developed and in daily use has an excellent 
prospect of mental and physical health.’’ He further states that psychiatric
interest cannot be divorced from the social forces which are a
direct factor in the production of mental and nervous illnesses nor from
other branches of general medicine because of the merging of the physical
and psychological aspects. The psychiatrist must be a well-equipped
physician, an educator and a sociologist. Emphasis is given to the fact
that ‘‘a psychiatric institution among other things must be a specialized
and extremely practical type of educational institution.’’

Among other points most of which must necessarily be omitted, Dr. 
Burlingame mentions the chaotic condition developing from the new and 
irrational attitude toward age groups. Today ‘‘youth’’ is a ‘‘flexible,
elastic term applied to anyone under thirty, and we find legislation introduced
which calls on state governments to aid ‘youth’ under this age.
Next we find ‘old age’ arbitrarily set at sixty-five. At sixty-five man is
supposed to have reached the terminal stage set by Shakespeare and at
that age one is out of the picture, sans teeth, sans hair, sans everything
but his social security benefits. Still more amazing is the realization
that one’s productive period has arbitrarily been limited to the age of
forty. ... The ‘seven ages of man’ have undergone drastic changes:
now there are but four: Dependency, extending to thirty, during which
time man is a ‘youth’; a hectic, productive decade, almost as brief as
the life of the bee, during which man is supposed to toil and gather honey
not only for himself but for the drones in the hive as well; reaching
forty, he arrives at an uncertain no-man’s land with no perquisites, no
honors, no security.’’ Every practicing psychiatrist knows that these
unnatural and arbitrary distinctions contribute to psychiatric conditions.

Again, while there is no question about our willingness to give our 
young people an education or provide for their future, the mistake has 
often been made in asking them ‘‘what are you getting out of it,’’ instead
of asking ‘‘what are you giving or learning to give to society?’’ Dr.
Burlingame feels that the student should not be encouraged to acquire a
conviction that the world owes him a living without commensurate effort
on his part.


PRATT, J. G., RHINE, J. B., SMITH, B. M., STUART, C. E., and GREENWOOD,
J. A. Extra-Sensory Perception after Sixty Years. New York: Henry
Holt and Company, 1940. Pp. xiv + 463.

Dr. Rhine and his collaborators from the staff of the Parapsychology
Laboratory at Duke University describe the mathematical and experimental
methods used in E.S.P. research, the results of E.S.P. tests at
Duke and elsewhere, and the criticisms offered thereof. After systematically
disposing of the various counter-hypotheses advanced to explain the
results in other than an extra-sensory fashion, the authors consider the
conditions, physical and psychological, which may influence E.S.P. and
discuss its possible nature. Investigations conducted since 1934, the date
of publication of Rhine’s first monograph, are stressed; they represent a
marked methodological advance over those reported in the earlier
publication.


The manuscript was sent for critical evaluation to seven persons who
were considered to have made the most penetrating and impressive critical
treatment of the research in this field; of the seven, four declined the
invitation; of the three others, two wrote unfavorable comments, and one,
favorable comments. These evaluations are printed in chapter 8. Despite
this pre-publication reception, and despite the fact that a survey in
1938 of 603 psychologists (352 answered) indicated that only five accepted
E.S.P. as an established phenomenon and twenty-six as a likely possibility,
the authors feel that the situation is by no means discouraging (p. 356).
On page 100 there is a summary of the results obtained in the six years,
1934 to 1939 inclusive, in 21 investigations, mostly, but not exclusively by
Rhine and his associates, in which there were ‘‘special safeguards against
error’’ and ‘‘exclusion of sensory cues.’’ Except for two relatively short
series of trials, the average number of hits per run (25 trials) in these
studies varies from 4.88 to 7.38 where average chance is 5.00. The 21
studies represent a grand total of 468,630 trials with an average of 5.5
hits per run. This compares with an average of 7.1 hits per run for the
experiments reported by Rhine in his first monograph under less controlled
conditions. On pages 102, 103, and 105 are summarized the results
secured under the most rigorous conditions used so far, such as independent
recording, official record sheets, and two experimenters; the average
number of hits per run is now 5.2. Whether the average will drop to 5.0
(chance) when all the most recently suggested safeguards are employed,
our reviewer does not care to predict.

Though this book is not likely to win converts from among persons
trained in psychological research, it may well increase the belief in E.S.P.
among non-psychological scientists and the laity. This is important for
the immediate future of E.S.P. because if there is much popular interest
there will be many tests of E.S.P. and if there are enough tests there are
bound occasionally to be some with results well above average chance
expectations of 5.0 hits per run and those results will be sent to the
Journal of Parapsychology for publication.

CLARENCE LEUBA,
Antioch College
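
The chance baseline cited in the review above (5.0 hits per run of 25 trials) and the reviewer's closing point about occasional high scores can be sketched numerically. The sketch is not drawn from the book under review. It assumes a simple binomial model in which each trial is an independent guess with a hit probability of one in five, the value implied by the stated chance figure; a deck of fixed composition departs somewhat from this model. All function names and the example figures are hypothetical.

```python
import math

def chance_run_stats(trials=25, p_hit=0.2):
    """Mean and standard deviation of hits per run under pure chance.

    Assumption: each trial is an independent guess with hit probability p_hit.
    """
    mean = trials * p_hit
    sd = math.sqrt(trials * p_hit * (1.0 - p_hit))
    return mean, sd

def prob_at_least(hits, trials=25, p_hit=0.2):
    """Binomial probability of scoring `hits` or more in one run by chance."""
    return sum(
        math.comb(trials, k) * p_hit**k * (1.0 - p_hit) ** (trials - k)
        for k in range(max(hits, 0), trials + 1)
    )

def prob_some_run_reaches(hits, runs, trials=25, p_hit=0.2):
    """Chance that at least one of `runs` independent runs reaches `hits`."""
    return 1.0 - (1.0 - prob_at_least(hits, trials, p_hit)) ** runs

if __name__ == "__main__":
    print(chance_run_stats())              # chance mean and SD per run
    print(prob_at_least(10))               # one run of 10 or more hits
    print(prob_some_run_reaches(10, 100))  # the same, somewhere in 100 runs
```

Under this model the chance expectation is 5 hits per run with a standard deviation of 2, and the probability that some run among many reaches a given high score grows steadily with the number of runs attempted.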


GUILFORD, J. P., Editor. Fields of Psychology. D. Van Nostrand Co.,
1940. 687 pp.

This is a big book. It is a book that includes the major ideas in condensed
form of a number of important specialists in various fields of
psychology. It is a big book literally and figuratively, and in consequence
has many good points and also many faults.

Some of the faults are faults of content. Much material included in
a book of this type is necessarily overfamiliar and inconsistent in its
relationship to the other parts of the book. Since each chapter is an
expression of one man’s point of view, the task of editing thus becomes of
major importance. In this respect, the editor has undoubtedly done as
well as possible with such varied material.

As a survey of a field that of child psychology is best done. It has a 
division into areas and a view of the child from each of the vantage points. 
Not only is this field divided into four segments, but each of the first three 
is subdivided. All this is presented in a delightful style and cannot fail 
to attract anyone with the slightest interest in child psychology. 

The chapters on social psychology concern themselves with fundamental 
concepts and methods, crowds and nationalism. As a unit these are well 
done and are certain to give the student a broad acquaintance with this 
field. The author uses very good description and illustration where, in 
the first of these chapters, he takes up in succession social groupings, 
social interaction and the methods used in social psychology. However, 
his second and third chapters are even better. It is surprising that the 
social psychologist can find so many pertinent remarks to make about 
nationalism. With the present state of world affairs, this chapter stands
out, perhaps, as the most appropriate, most fashionable, and thus important,
one in the book.

The section devoted to abnormal psychology contains sixteen case descriptions
exemplifying the variations in behavior. Here one finds the
customary divisions into minor and major, functional and organic
abnormalities. There is a discussion of the significance in terms of the
extent, cost and meaning of the subject dealt with here. Although the
author has done a good job in compressing a lot of material into three
chapters, it constitutes, after all, only a miniature abnormal psychology.

The material on differential psychology presents a good view of the 
difficulties of determining which characteristics of the personality are 
immutable and which are not. Constitutional types and heredity and 
environment are the chief topics to supply the evidence here. In the 
second chapter of this field the questions occur of who shall be compared
and how. In the direction of answering these the author discusses prob- 
lems of sampling and of measurement. She gives a thoroughly adequate 
treatment of these subjects. 

In reviewing a book as large as this, it is impossible to cover all the 
contributions. Those which have been singled out for specific discussion 
have seemed to be the most meritorious. The other contributions vary 
greatly in scope and in presentation, but it is possible that another re- 
viewer might find them of greater interest than the ones selected above. 

The references at the conclusion of chapters are excellent as guides to 
further reading and, in addition, the bibliographical material at the end 
of the book is so organized that material relating to any subject may be 
readily discovered. The book as a whole covers all the major fields of 
psychology. It is designed primarily as a second semester textbook 
and will probably fulfill its purpose for such semi-advanced students 
admirably. 

K. W. OBERLIN, 
University of Delaware 


BUHLER, CHARLOTTE. The Child and His Family. New York: Harper 
and Brothers, 1939. viii+187 pp. 


This book claims significance through its attempt, ‘‘to apply exact 
methods to problems which have hitherto been approached only descrip- 
tively.’’ It presents an analysis of data from eight families, concerning 
the behavior and detailed conversations of children, recorded in family 
situations. The material was collected over a period of several weeks by
observers who endeavored to fit themselves as naturally as possible into 
the family situation. The analytic method consists of tabulating types 
of responses in terms of per cent of all recorded behavior as, for example, 
means by which the child establishes contact with other people in his 
environment. These percentages are presented in bar graph form in the 
fifty tables included in the book. 

The method and scope are given in the introduction. Chapter I deals 
with general aspects of parent-child relationships, Chapter II with the 
individual families, Chapter III with general aspects of sibling relation- 
ships, Chapter IV with the individual sibling pairs, and Chapter V with a 
summary of sibling relationships. Some 120 pages (Chaps. II, III and
IV) are devoted largely to the actual data collected. These sections make
interesting and informative reading which is only marred by the occasional
superficiality of the interpretation which is consistently attempted. One
major difficulty with the interpretation is the neglecting of factors of
training prior to the observation period. The analytic method presented
is not as new as the author suggests and appears to add little, if anything,
to either the scientific or purely ‘‘intuitive’’ understanding of the material
presented.

A short appendix by Sophie Gideon treats the problems of disobedience
by presenting correlations, based on eight cases, of frequency and type of
disobedient behavior in relation to age, and nature of home discipline.

In spite of the author’s claim of methodological significance the main
contribution of this book lies in the case material which makes a valuable
addition to the published data in the field of child psychology.

J. B. ROTTER,
Indiana University

CARR, L. J., VALENTINE, M. A., and LEVY, M. H. Integrating the Camp,
the Community, and Social Work. New York: Association Press,
1939. xix+220 pp.

That the treatment of behavior problems in children has been attacked
from many angles is well illustrated in Carl Rogers’ excellent summary of
treatment methods. There appear to be two general types of treatment
theory. One of these directs attention to the individual himself, i.e.,
so-called psychotherapy. The second theory directs attention to the
environment in which the child is living or may be placed. The latter, or
environmental manipulation method, is the one most widely favored by
sociologists. In both of these methods the emphasis has been primarily
on dealing with individuals. Obviously this is time-consuming and
expensive. As a result a number of attempts have been made at so-called
group therapy.

The authors of the present book report a three-year study carried out
under the title of the Ann Arbor Boy’s Guidance Project. Approximately
90 boys were selected on the recommendation of teachers and social agencies
because they presented some type of problem. These boys were sent to a
summer camp in 1935. Here a careful study was made, and the counselors
became closely acquainted with the children. As far as possible the same
counselors kept in touch with the boys during the following two or three
years. While the counselors made contacts with the families and helped
each boy in some of his problems, a great deal of effort was devoted to
dealing with the boys in groups. It is impossible here to review even
briefly the wealth of information reported in the monograph. Some
attempt was made to contrast the results of the social therapy program by
securing data on a similar size group of boys living in the same
neighborhoods who were not given any therapeutic attention.

The conclusions as to the efficiency of this method of social therapy are
somewhat disappointing. The authors are very cautious in all their
interpretations. They find that the children could be divided into those whose
problem behavior was due to faulty training and those in whom it was due
to emotional maladjustment. In the former group the method of social
therapy was definitely more successful than for the latter group. Apart
from any question of the efficiency of treatment the demonstration of these
two kinds of problem causes is itself important. A similar distinction has
been elsewhere made by the present reviewer who has felt that sociologists
have tended to consider all problems as though they were all of the first
sort, while psychiatrists deal with them as though they were all of the
second sort.

While this monograph does not afford clear-cut evidence in favor of
social therapy, it is none the less a very significant contribution. On an
experimental basis an attempt has been made to integrate all of the
facilities available in a community for the purpose of delinquency
prevention. While the attempt may not have been entirely successful it blazes a
trail which others might follow. The authors recognize the shortcomings
of their whole program and point out ways in which subsequent work could
improve upon it. The present reviewer as a psychologist is a little
dismayed at the apparent complete lack of psychological cooperation in the
project. Maybe this has not been too serious a lack! He would, however,
recommend the book most strongly to those of his psychological colleagues
who are concerned with the problems of child guidance.

C. M. LOUTTIT,
Indiana University